Nucleotide Sequences Encoding Maize RAD51

Background of the Invention 

Transgenic plant product development by conventional transformation and breeding
efforts is a slow and unpredictable process. Gene targeting systems can overcome
problems with expression variability, the unpredictable impact of random gene insertion
on agronomic performance, and the large number of experiments that need to be
conducted. Such systems can also provide approaches to manipulating endogenous
genes. Of course, a targeting system requires the ability to focus the recombination
process to favor recovery of desired targeting events.

The natural cellular DNA repair and recombination machinery consists of a
complex array of protein components interacting in a highly controlled manner to ensure
that the fidelity of the genome is conserved throughout the many internal events or
external stimuli experienced during each cell cycle. The ability to manipulate this
machinery requires an understanding of how specific proteins are involved in the
process, and how the genes that encode those proteins are regulated. Since the primary
approaches to gene targeting involve recombinases, whether operating in their natural in
vivo environment (as during normal recombination) or as part of schemes that involve
pretreatment of substrates so as to associate DNA with a recombinase and increase
efficiency of targeting (e.g., double D-loop), there is a continuing need to isolate and
characterize the genes for these molecules. Because many different protein components
may be involved in gene targeting, the availability of host-specific genes and proteins
could avoid possible problems of incompatibility associated with molecular interactions
due to heterologous components.

Sequences for the bacterial RecA recombinase and functional homologs from
yeast and several animal species have been disclosed in various publicly accessible
sequence databases. Numerous publications characterizing these recombinases exist
(see, e.g., Kowalczykowski et al., Annu. Rev. Biochem., 63:991-1043 (1994)). Reports
of the use of bacterial RecA in association with DNA sequences to manipulate
homologous target DNA, including improvement of the efficiency of gene targeting in
non-plant systems, have been published (see, e.g., PCT published Patent Application
Nos. WO 87/01730 and WO 93/22443).
The catalysis of in vitro pairing and strand exchange between circular viral single
strand DNA ("ss DNA") and linear duplex DNA ("ds DNA") by a RAD51 recombinase
from S. cerevisiae has also been reported (see, e.g., Sung, Science, 265:1241-43 (1994);
Kanaar, et al., Nature 391:335-338 (1998); Benson, et al., Nature 391:401-410 (1998)).
To date, however, work with recombinase enzymes in plants has been very limited.
Accordingly, there is an ongoing need for the identification and characterization of the
functional activities of recombinase enzymes which may offer improved and expanded
methods for use in plant systems, particularly agriculturally important crop species such
as maize.


Summary of the Invention 

Polynucleotide sequences that encode putatively active RAD51 recombinases
have been isolated from maize. Specifically, cDNA clones ZmRAD51A (SEQ ID NO: 1)
and ZmRAD51B (SEQ ID NO: 5) from a maize tassel library have been identified and
sequenced. The cDNA sequences include 3'-untranslated regions (SEQ ID NOS: 4 and 8)
suitable for use in making gene-specific probes, e.g., probes which can be used to map the
locus of the respective ZmRAD51 gene in an RFLP map of a maize population. The RFLP
probes are typically at least 15 nucleotide residues in length, although smaller and larger
sizes may also be used.

The present invention also includes expression cassettes, vectors, and host cells that
incorporate the ZmRAD51 genes. Monocot cells, such as maize cells, are particularly
preferred as host cells. In addition, a nuclear localization sequence comprising the 5' end
of the ZmRAD51 gene is identified.

In a further aspect, the present invention relates to an isolated protein comprising a
polypeptide of at least 10 contiguous amino acids encoded by the isolated nucleic acid of
ZmRAD51A or ZmRAD51B. In some embodiments, the polypeptide has a sequence
selected from the group consisting of SEQ ID NOS: 3 and 7.

In yet another aspect, the present invention relates to a transgenic plant comprising an
expression cassette comprising a plant promoter operably linked to any of the isolated
nucleic acids of the present invention. Methods for modulating, in a transgenic plant, the
expression of the nucleic acids of the present invention are also included. In some
embodiments, the transgenic plant is Zea mays. The present invention also provides
transgenic seed from the transgenic plant.

In a further aspect, the present invention relates to a method of making maize
recombinase by transforming or transfecting a host cell with an expression vector
containing one of the isolated nucleic acids of the present invention and purifying the
recombinase protein from the host cell. In some embodiments, the host cell is a bacterial
cell, a yeast cell, or a plant cell.



Brief Description of the Drawings 

Fig. 1 shows a map of a plasmid designated PHP8060 derived from the insertion of
a modified ZmRAD51A gene between a maize ubiquitin promoter and a potato proteinase
inhibitor ("PinII") terminator in a pUC19 plasmid backbone.

Fig. 2 shows a map of a plasmid designated PHP8103 derived from the insertion of
a modified ZmRAD51B gene between a maize ubiquitin promoter and a potato proteinase
inhibitor ("PinII") terminator in a pUC19 plasmid backbone.

Fig. 3 shows a map of a plasmid designated PHP8744 derived from the insertion of
a GFPm gene 5' to the start of the modified ZmRAD51A gene in PHP8060 to create a
sequence encoding a GFP/ZmRAD51A fusion protein. Optionally, the modified
ZmRAD51B gene could be placed instead of the modified ZmRAD51A gene, to form a
GFP/ZmRAD51B fusion gene.



Detailed Description of the Invention

Full-length cDNA clones for two maize homologs of the yeast RAD51 gene have
been isolated. Significant transcription levels have been detected primarily in immature
ears and anthers that contain cells progressing through the early stages of meiosis. The
two isolated cDNAs, however, are more closely related to RAD51 family members
expressed in mitotic cells than to the meiosis-specific homologs from plants (LIM15) and
yeast (DMC1). RFLP mapping indicates that the Zea mays genome contains two genes
encoding different variants of the ZmRAD51 recombinase enzyme. The genes encoding
each protein (ZmRAD51A and ZmRAD51B) are unlinked, and their map positions do not
correspond to any known maize mutations with a meiotic phenotype. In addition to
providing nucleotide sequences, which can be used to produce substantially purified
RAD51 proteins, the results presented herein indicate that sequences from the
ZmRAD51A and ZmRAD51B cDNA clones can serve both as sources of hybridization
probes for RAD51-related genes and as novel and unique RFLP probes for
applications such as mapping or marker-assisted selection in maize.

The isolated polynucleotides and polypeptides of the present invention can be
used over a broad range of plant types, particularly monocots such as the species of the
family Gramineae including Hordeum, Secale, Triticum, Sorghum (e.g., S. bicolor) and
Zea (e.g., Z. mays). The isolated nucleic acids and proteins of the present invention can
also be used in species from the genera: Cucurbita, Rosa, Vitis, Juglans, Fragaria,
Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium,
Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum,
Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana,
Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis,
Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis,
Cucumis, Browallia, Glycine, Pisum, Phaseolus, Lolium, Oryza, and Avena.

Nucleotide Sequence Encoding ZmRAD51A & ZmRAD51B Proteins

As used herein, "nucleic acid" includes reference to a deoxyribonucleotide or
ribonucleotide polymer in either single- or double-stranded form, and unless otherwise
limited, encompasses known analogues having the essential nature of natural nucleotides
in that they hybridize to single-stranded nucleic acids in a manner similar to naturally
occurring nucleotides (e.g., peptide nucleic acids).

By "nucleic acid library" is meant a collection of isolated DNA or RNA
molecules which comprise and substantially represent the entire transcribed fraction of a
genome of a specified organism. Construction of exemplary nucleic acid libraries, such
as genomic and cDNA libraries, is taught in standard molecular biology references such
as Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in
Enzymology, Vol. 152, Academic Press, Inc., San Diego, CA (Berger); Sambrook et al.,
Molecular Cloning - A Laboratory Manual, 2nd ed., Vol. 1-3 (1989); and Current
Protocols in Molecular Biology, F.M. Ausubel et al., Eds., Current Protocols, a joint
venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc.
(1994).

The term "isolated" refers to material, such as a nucleic acid or protein, which is:
(1) substantially or essentially free from components that normally accompany or interact
with it as found in its naturally occurring environment. The isolated material optionally
comprises material not found with the material in its natural environment; or (2) if the
material is in its natural environment, the material has been synthetically (non-naturally)
altered by deliberate human intervention to a composition and/or placed at a location in
the cell (e.g., genome or subcellular organelle) not native to a material found in that
environment. The alteration to yield the synthetic material can be performed on the
material within or removed from its natural state. For example, a naturally occurring
nucleic acid becomes an isolated nucleic acid if it is altered, or if it is transcribed from
DNA which has been altered, by means of human intervention performed within the cell
from which it originates. See, e.g., Compound and Methods for Site Directed
Mutagenesis in Eukaryotic Cells, Kmiec, U.S. Patent No. 5,565,350; In Vivo
Homologous Sequence Targeting in Eukaryotic Cells; Zarling et al., PCT/US93/03868.
Likewise, a naturally occurring nucleic acid (e.g., a promoter) becomes isolated if it is
introduced by non-naturally occurring means to a locus of the genome not native to that
nucleic acid. Nucleic acids which are "isolated", as defined herein, are also referred to
as "heterologous" nucleic acids.

As used herein, "operably linked" includes reference to a functional linkage
between a promoter and a second sequence, wherein the promoter sequence initiates and
mediates transcription of the DNA sequence corresponding to the second sequence.
Generally, operably linked means that the nucleic acid sequences being linked are
contiguous and, where necessary to join two protein coding regions, contiguous and in
the same reading frame.

As used herein, "polynucleotide" includes reference to a
deoxyribopolynucleotide, ribopolynucleotide, or analogs thereof that have the essential
nature of a natural ribonucleotide in that they hybridize, under stringent hybridization
conditions, to substantially the same nucleotide sequence as naturally occurring
nucleotides and/or allow translation into the same amino acid(s) as the naturally
occurring nucleotide(s). A polynucleotide can be full-length or a subsequence of a native
or heterologous structural or regulatory gene. Unless otherwise indicated, the term
includes reference to the specified sequence as well as the complementary sequence
thereof. Thus, DNAs or RNAs with backbones modified for stability or for other reasons
are "polynucleotides" as that term is intended herein. Moreover, DNAs or RNAs
comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to
name just two examples, are polynucleotides as the term is used herein. It will be
appreciated that a great variety of modifications have been made to DNA and RNA that
serve many useful purposes known to those of skill in the art. The term polynucleotide as it
is employed herein embraces such chemically, enzymatically or metabolically modified
forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of
viruses and cells, including among other things, simple and complex cells.

The terms "polypeptide", "peptide" and "protein" are used interchangeably herein
to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in
which one or more amino acid residues is an artificial chemical analogue of a
corresponding naturally occurring amino acid, as well as to naturally occurring amino
acid polymers. The essential nature of such analogues of naturally occurring amino
acids is that, when incorporated into a protein, that protein is specifically reactive to
antibodies elicited to the same protein but consisting entirely of naturally occurring
amino acids. The terms "polypeptide", "peptide" and "protein" are also inclusive of
modifications including, but not limited to, glycosylation, lipid attachment, sulfation,
gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation. It
will be appreciated, as is well known and as noted above, that polypeptides are not always
entirely linear. For instance, polypeptides may be branched as a result of ubiquitination,
and they may be circular, with or without branching, generally as a result of posttranslation
events, including natural processing events and events brought about by human manipulation
which do not occur naturally. Circular, branched and branched circular polypeptides may
be synthesized by non-translation natural processes and by entirely synthetic methods, as
well. Further, this invention contemplates the use of both the methionine-containing and
the methionine-less amino terminal variants of the protein of the invention.

The following terms are used to describe the sequence relationships between two
or more nucleic acids or polynucleotides: (a) "reference sequence", (b) "comparison
window", (c) "sequence identity", (d) "percentage of sequence identity", and (e)
"substantial identity".

(a) As used herein, "reference sequence" is a defined sequence used as a basis
for sequence comparison. A reference sequence may be a subset or the entirety of a
specified sequence: for example, a segment of a full-length cDNA or gene sequence,
or the complete cDNA or gene sequence.

(b) As used herein, "comparison window" includes reference to a contiguous and
specified segment of a polynucleotide sequence, wherein the polynucleotide sequence
may be compared to a reference sequence and wherein the portion of the polynucleotide
sequence in the comparison window may comprise additions or deletions (i.e., gaps)
compared to the reference sequence (which does not comprise additions or deletions) for
optimal alignment of the two sequences. Generally, the comparison window is at least
20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100, or longer.
Those of skill in the art understand that to avoid a high similarity to a reference sequence
due to inclusion of gaps in the polynucleotide sequence, a gap penalty is typically
introduced and is subtracted from the number of matches.

Methods of alignment of sequences for comparison are well-known in the art.
Optimal alignment of sequences for comparison may be conducted by the local homology
algorithm of Smith and Waterman, Adv. Appl. Math. 2: 482 (1981); by the homology
alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48: 443 (1970); by the
search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. 85: 2444
(1988); by computerized implementations of these algorithms, including, but not limited
to: CLUSTAL in the PC/Gene program by Intelligenetics, Mountain View, California,
GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software
Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, Wisconsin,
USA; the CLUSTAL program is well described by Higgins and Sharp, Gene 73:
237-244 (1988); Higgins and Sharp, CABIOS 5: 151-153 (1989); Corpet, et al., Nucleic
Acids Research 16: 10881-90 (1988); Huang, et al., Computer Applications in the
Biosciences 8: 155-65 (1992), and Pearson, et al., Methods in Molecular Biology 24:
307-331 (1994); preferred computer alignment methods also include the BLASTP,
BLASTN, and BLASTX algorithms. Altschul, et al., J. Mol. Biol. 215: 403-410 (1990).
Alignment is also often performed by inspection and manual alignment.
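
By way of non-limiting illustration, the local homology approach of Smith and Waterman
can be sketched as a short dynamic-programming routine. The sketch below returns only
the best local alignment score. The scoring values (match = 2, mismatch = -1, linear
gap = -2) and the function name are assumptions chosen for the example and are not
taken from the cited references or from any of the named software packages.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    The scoring values are illustrative assumptions, not program defaults.
    """
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]               # column 0 is always zero for local alignment
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            cell = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best
```

With these assumed scores, two sequences that share only the core "ACGT" (for example
"TTTACGTAAA" and "GGGACGTCCC") score 8, the same as "ACGT" aligned against itself.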

Unless otherwise stated, sequence identity/similarity values provided herein refer
to the value obtained using the BLAST 2.0 suite of programs using default parameters.
Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997). Software for performing
BLAST analyses is publicly available, e.g., through the National Center for
Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves
first identifying high scoring sequence pairs (HSPs) by identifying short words of length
W in the query sequence, which either match or satisfy some positive-valued threshold
score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These
initial neighborhood word hits act as seeds for initiating searches to find longer HSPs
containing them. The word hits are then extended in both directions along each sequence
for as far as the cumulative alignment score can be increased. Cumulative scores are
calculated using, for nucleotide sequences, the parameters M (reward score for a pair of
matching residues; always > 0) and N (penalty score for mismatching residues; always
< 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative
score. Extension of the word hits in each direction is halted when: the cumulative
alignment score falls off by the quantity X from its maximum achieved value; the
cumulative score goes to zero or below, due to the accumulation of one or more
negative-scoring residue alignments; or the end of either sequence is reached. The
BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the
alignment. The BLASTN program (for nucleotide sequences) uses as defaults a
wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M = 5, N = -4, and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring
matrix (see Henikoff & Henikoff (1989) Proc. Natl. Acad. Sci. USA 89:10915).
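
By way of non-limiting illustration, the ungapped extension of a nucleotide word hit
described above can be sketched as follows. The sketch uses M = 5 and N = -4 as stated
above; the drop-off value X = 20, the function names, and the short word length used in
the usage note are assumptions made for the example. This is a simplified sketch of the
halting rules only and is not the BLAST program itself.

```python
def _extend(query, subject, q, s, step, score, reward, penalty, x_drop):
    """Extend one way from (q, s); return (best score, residues kept)."""
    best, kept, walked = score, 0, 0
    # The loop ends when the end of either sequence is reached.
    while 0 <= q < len(query) and 0 <= s < len(subject):
        score += reward if query[q] == subject[s] else penalty
        walked += 1
        if score > best:
            best, kept = score, walked
        # Halt on a fall of X below the maximum, or a score of zero or below.
        if best - score >= x_drop or score <= 0:
            break
        q += step
        s += step
    return best, kept


def extend_word_hit(query, subject, q_start, s_start, word_len,
                    reward=5, penalty=-4, x_drop=20):
    """Extend an exact word hit in both directions.

    Returns (score, query_from, query_to) with query_to exclusive.
    x_drop=20 is an assumed value for the quantity X.
    """
    seed = reward * word_len  # the seed word is assumed to match exactly
    best, right = _extend(query, subject, q_start + word_len,
                          s_start + word_len, 1, seed, reward, penalty, x_drop)
    best, left = _extend(query, subject, q_start - 1, s_start - 1, -1,
                         best, reward, penalty, x_drop)
    return best, q_start - left, q_start + word_len + right
```

For example, extending the four-residue word at position 0 of "ACGTACGTTT" against
"ACGTACGTAA" yields a score of 40 covering query positions 0 to 8; the two trailing
mismatches lower the running score but are not retained in the reported segment.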

In addition to calculating percent sequence identity, the BLAST algorithm also
performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin
& Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5787 (1993)). One measure of
similarity provided by the BLAST algorithm is the smallest sum probability (P(N)),
which provides an indication of the probability by which a match between two nucleotide
or amino acid sequences would occur by chance.

BLAST searches assume that proteins can be modeled as random sequences.
However, many real proteins comprise regions of nonrandom sequences, which may be
homopolymeric tracts, short-period repeats, or regions enriched in one or more amino
acids. Such low-complexity regions may be aligned between unrelated proteins even
though other regions of the protein are entirely dissimilar. A number of low-complexity
filter programs can be employed to reduce such low-complexity alignments. For
example, the SEG (Wooten and Federhen, Comput. Chem., 17:149-163 (1993)) and
XNU (Claverie and States, Comput. Chem., 17:191-201 (1993)) low-complexity filters
can be employed alone or in combination.

(c) As used herein, "sequence identity" or "identity" in the context of two
nucleic acid or polypeptide sequences includes reference to the residues in the two
sequences which are the same when aligned for maximum correspondence over a
specified comparison window. When percentage of sequence identity is used in
reference to proteins it is recognized that residue positions which are not identical often
differ by conservative amino acid substitutions, where amino acid residues are
substituted for other amino acid residues with similar chemical properties (e.g., charge or
hydrophobicity) and therefore do not change the functional properties of the molecule.
Where sequences differ in conservative substitutions, the percent sequence identity may
be adjusted upwards to correct for the conservative nature of the substitution. Sequences
which differ by such conservative substitutions are said to have "sequence similarity" or
"similarity". Means for making this adjustment are well-known to those of skill in the
art. Typically this involves scoring a conservative substitution as a partial rather than a
full mismatch, thereby increasing the percentage sequence identity. Thus, for example,
where an identical amino acid is given a score of 1 and a non-conservative substitution is
given a score of zero, a conservative substitution is given a score between zero and 1.
The scoring of conservative substitutions is calculated, e.g., according to the algorithm
of Meyers and Miller, Computer Applic. Biol. Sci., 4: 11-17 (1988), e.g., as
implemented in the program PC/GENE (Intelligenetics, Mountain View, California,
USA).
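
By way of non-limiting illustration, the partial scoring of conservative substitutions
described above can be sketched as follows. The residue groupings, the partial score of
0.5, and the function names are assumptions chosen for the example; the sketch is not the
algorithm of Meyers and Miller and does not reproduce the PC/GENE implementation.

```python
# Hypothetical grouping of amino acids by similar chemical properties.
# These groups are an assumption for illustration only.
CONSERVATIVE_GROUPS = ("AVLIM", "FWY", "KRH", "DE", "NQ", "ST")


def position_score(x: str, y: str, partial: float = 0.5) -> float:
    """Score 1 for identity, `partial` for a conservative change, else 0."""
    if x == "-" or y == "-":
        return 0.0                      # gapped positions score zero
    if x == y:
        return 1.0
    if any(x in group and y in group for group in CONSERVATIVE_GROUPS):
        return partial                  # partial rather than full mismatch
    return 0.0


def percent_similarity(aligned_a: str, aligned_b: str, partial: float = 0.5) -> float:
    """Identity percentage adjusted upwards for conservative substitutions."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    total = sum(position_score(x, y, partial)
                for x, y in zip(aligned_a.upper(), aligned_b.upper()))
    return 100.0 * total / len(aligned_a)
```

Under these assumptions, "KDEL" aligned with "RDEL" has 75% identity (three of four
positions) but scores 87.5%, because the lysine-to-arginine change counts as one half.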

(d) As used herein, "percentage of sequence identity" means the value determined
by comparing two optimally aligned sequences over a comparison window, wherein the
portion of the polynucleotide sequence in the comparison window may comprise
additions or deletions (i.e., gaps) as compared to the reference sequence (which does not
comprise additions or deletions) for optimal alignment of the two sequences. The
percentage is calculated by determining the number of positions at which the identical
nucleic acid base or amino acid residue occurs in both sequences to yield the number of
matched positions, dividing the number of matched positions by the total number of
positions in the window of comparison and multiplying the result by 100 to yield the
percentage of sequence identity.
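
By way of non-limiting illustration, the calculation described in (d) can be sketched as
follows for two sequences that have already been optimally aligned. The function name and
the use of "-" as the gap character are assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over an aligned comparison window.

    Both inputs are rows of one alignment and so must have equal length.
    Matched positions are divided by the total number of positions in the
    window (gapped columns included) and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap          # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)
```

For example, the ten-position window "ATGC-TAGCA" aligned with "ATGCATAGGA" has eight
matched positions, giving a percentage of sequence identity of 80.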

(e) (i) The term "substantial identity" of polynucleotide sequences means that a
polynucleotide comprises a sequence that has at least 70% sequence identity, preferably
at least 80%, more preferably at least 90% and most preferably at least 95%, compared
to a reference sequence using one of the alignment programs described using standard
parameters. One of skill will recognize that these values can be appropriately adjusted to
determine corresponding identity of proteins encoded by two nucleotide sequences by
taking into account codon degeneracy, amino acid similarity, reading frame positioning
and the like. Substantial identity of amino acid sequences for these purposes normally
means sequence identity of at least 60%, more preferably at least 70%, 80%, 90%, and
most preferably at least 95%. Polypeptides which are "substantially similar" share
sequences as noted above except that residue positions which are not identical may differ
by conservative amino acid changes.

Another indication that nucleotide sequences are substantially identical is if two
molecules hybridize to each other under stringent conditions. However, nucleic acids
which do not hybridize to each other under stringent conditions are still substantially
identical if the polypeptides which they encode are substantially identical. This may
occur, e.g., when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. One indication that two nucleic acid
sequences are substantially identical is that the polypeptide which the first nucleic acid
encodes is immunologically cross reactive with the polypeptide encoded by the second
nucleic acid.

(e) (ii) The term "substantial identity" in the context of a peptide indicates that a
peptide comprises a sequence with at least 70% sequence identity to a reference
sequence, preferably 80%, more preferably 85%, most preferably at least 90% or 95%
sequence identity to the reference sequence over a specified comparison window.
Preferably, optimal alignment is conducted using the homology alignment algorithm of
Needleman and Wunsch, J. Mol. Biol. 48: 443 (1970). An indication that two peptide
sequences are substantially identical is that one peptide is immunologically reactive with
antibodies raised against the second peptide. Thus, a peptide is substantially identical to
a second peptide, for example, where the two peptides differ only by a conservative
substitution.

The terms "stringent conditions" or "stringent hybridization conditions" include
reference to conditions under which a probe will hybridize to its target sequence to a
detectably greater degree than to other sequences (e.g., at least 2-fold over background).
Stringent conditions are sequence-dependent and will be different in different
circumstances. By controlling the stringency of the hybridization and/or washing
conditions, target sequences can be identified which are 100% complementary to the
probe (homologous probing). Alternatively, stringency conditions can be adjusted to
allow some mismatching in sequences so that lower degrees of similarity are detected
(heterologous probing). Generally, a probe is less than about 1000 nucleotides in length,
optionally less than 500 nucleotides in length.

Typically, stringent conditions will be those in which the salt concentration is less
than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other
salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g.,
10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50
nucleotides). Stringent conditions may also be achieved with the addition of destabilizing
agents such as formamide. Exemplary low stringency conditions include hybridization
with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl
sulphate) at 37°C, and a wash in 1X to 2X SSC (20X SSC = 3.0 M NaCl/0.3 M
trisodium citrate) at 50 to 55°C. Exemplary moderate stringency conditions include
hybridization in 40 to 45% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in
0.5X to 1X SSC at 55 to 60°C. Exemplary high stringency conditions include
hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC
at 60 to 65°C.

Specificity is typically the function of post-hybridization washes, the critical
factors being the ionic strength and temperature of the final wash solution. For DNA-DNA
hybrids, the Tm can be approximated from the equation of Meinkoth and Wahl,
Anal. Biochem., 138:267-284 (1984): Tm = 81.5 °C + 16.6 (log M) + 0.41 (%GC) -
0.61 (% form) - 500/L; where M is the molarity of monovalent cations, %GC is the
percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage
of formamide in the hybridization solution, and L is the length of the hybrid in base
pairs. The Tm is the temperature (under defined ionic strength and pH) at which 50% of
a complementary target sequence hybridizes to a perfectly matched probe. Tm is reduced
by about 1 °C for each 1% of mismatching; thus, Tm, hybridization and/or wash
conditions can be adjusted to hybridize to sequences of the desired identity. For
example, if sequences with ≥90% identity are sought, the Tm can be decreased 10 °C.
Generally, stringent conditions are selected to be about 5 °C lower than the thermal
melting point (Tm) for the specific sequence and its complement at a defined ionic
strength and pH. However, severely stringent conditions can utilize a hybridization
and/or wash at 1, 2, 3, or 4 °C lower than the thermal melting point (Tm); moderately
stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10 °C lower
than the thermal melting point (Tm); low stringency conditions can utilize a hybridization
and/or wash at 11, 12, 13, 14, 15, or 20 °C lower than the thermal melting point (Tm).
Using the equation, hybridization and wash compositions, and desired Tm, those of
ordinary skill will understand that variations in the stringency of hybridization and/or
wash solutions are inherently described. If the desired degree of mismatching results in
a Tm of less than 45 °C (aqueous solution) or 32 °C (formamide solution) it is preferred
to increase the SSC concentration so that a higher temperature can be used. An
extensive guide to the hybridization of nucleic acids is found in Tijssen, Laboratory
Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid
Probes, Part I, Chapter 2 "Overview of principles of hybridization and the strategy of
nucleic acid probe assays", Elsevier, New York (1993); and Current Protocols in
Molecular Biology, Chapter 2, Ausubel, et al., Eds., Greene Publishing and Wiley-
Interscience, New York (1995).
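
By way of non-limiting illustration, the Tm approximation and the mismatch adjustment
described above can be sketched as follows. The function and parameter names are
assumptions made for the example, and the logarithm is taken to be base 10.

```python
import math


def tm_dna_hybrid(molarity: float, percent_gc: float,
                  percent_formamide: float, length_bp: int) -> float:
    """Approximate Tm in degrees C for a DNA-DNA hybrid.

    Tm = 81.5 + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(molarity)   # M: molarity of monovalent cations
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)            # L: hybrid length in base pairs


def tm_for_identity(tm_perfect: float, percent_identity: float) -> float:
    """Lower Tm by about 1 degree C for each 1% of mismatching."""
    return tm_perfect - (100.0 - percent_identity)


def stringent_temperature(tm: float, offset: float = 5.0) -> float:
    """Temperature about `offset` degrees C below Tm (5 for stringent conditions)."""
    return tm - offset
```

For example, a 500 base pair hybrid with 50% GC content in 1 M monovalent cation and 50%
formamide gives a Tm of approximately 70.5 °C; if sequences with at least 90% identity
are sought, the adjusted Tm is approximately 60.5 °C.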

The present invention provides isolated nucleic acids comprising polynucleotides
complementary to the ZmRAD51 polynucleotides. As those of
skill in the art will recognize, complementary sequences base-pair throughout the entirety
of their length with the polynucleotides (i.e., have 100% sequence identity over their
entire length). Complementary bases associate through hydrogen bonding in double
stranded nucleic acids. For example, the following base pairs are complementary:
guanine and cytosine; adenine and thymine; and adenine and uracil.
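
By way of non-limiting illustration, the base-pairing rules above can be expressed as a
short check of whether two strands are complementary throughout the entirety of their
length. The function names are assumptions made for the example, and uracil is treated as
equivalent to thymine for pairing with adenine.

```python
# Pairing rules: guanine-cytosine and adenine-thymine (uracil handled as thymine).
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a strand, written 5' to 3' as DNA."""
    dna = seq.upper().replace("U", "T")
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def is_fully_complementary(strand_a: str, strand_b: str) -> bool:
    """True when the two strands base-pair over their entire length."""
    if len(strand_a) != len(strand_b):
        return False
    return reverse_complement(strand_a) == strand_b.upper().replace("U", "T")
```

For example, the reverse complement of "ATGGCC" is "GGCCAT", and the RNA strand
"GGCCAU" is likewise reported as fully complementary to "ATGGCC".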

The present invention provides isolated nucleic acids comprising polynucleotides
of the present invention, wherein the polynucleotides encode a protein having a
subsequence of contiguous amino acids from a polypeptide of the present invention. The
length of contiguous amino acids from the prototype polypeptide is selected from the
group of integers consisting of from at least 10 to the number of amino acids within the
prototype sequence. Thus, for example, the polynucleotide can encode a polypeptide
having a subsequence having at least 10, 15, 20, 25, 30, 35, 40, 45, or 50 contiguous
amino acids from the prototype polypeptide.
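
By way of non-limiting illustration, whether a candidate polypeptide contains a run of
at least a given number of contiguous amino acids from a prototype polypeptide can be
checked as sketched below. The function names are assumptions, and the sequences in the
usage note are arbitrary placeholders, not any sequence of the SEQ ID NOS.

```python
def longest_shared_run(candidate: str, prototype: str) -> int:
    """Length of the longest stretch of contiguous residues common to both."""
    prev = [0] * (len(prototype) + 1)
    best = 0
    for residue_c in candidate:
        cur = [0]
        for j, residue_p in enumerate(prototype, start=1):
            # Extend the run ending at the previous residue pair, or reset to 0.
            run = prev[j - 1] + 1 if residue_c == residue_p else 0
            cur.append(run)
            best = max(best, run)
        prev = cur
    return best


def has_contiguous_subsequence(candidate: str, prototype: str,
                               min_len: int = 10) -> bool:
    """True when at least `min_len` contiguous prototype residues are present."""
    return longest_shared_run(candidate, prototype) >= min_len
```

For example, the placeholder strings "XXMKTAYIAKQRQISFVKXX" and "GGMKTAYIAKQRQISFVKGG"
share a run of 16 contiguous residues, which satisfies thresholds of 10 and 15 but not 20.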

The present invention provides subsequences comprising isolated nucleic acids
containing at least 15 contiguous bases of the inventive sequences. The number of such
subsequences encoded by a polynucleotide of the instant embodiment can be any integer
selected from the group consisting of from 1 to 20, such as 2, 3, 4, or 5. The
subsequences can be separated by any integer of nucleotides from 1 to the number of
nucleotides in the sequence such as at least 5, 10, 15, 25, 50, 100, or 200 nucleotides.
Subsequences of the isolated nucleic acid can be used to modulate or detect gene
expression by introducing into the subsequences compounds which bind, intercalate,
cleave and/or crosslink to nucleic acids. Exemplary compounds include acridine,
psoralen, phenanthroline, naphthoquinone, daunomycin or chloroethylaminoaryl
conjugates. The subsequences of the present invention can comprise structural
characteristics of the sequence from which they are derived. Alternatively, the subsequences
can lack certain structural characteristics of the larger sequence from which they are derived,
such as a poly(A) tail.

The proteins encoded by polynucleotides of this embodiment, when
presented as an immunogen, elicit the production of polyclonal antibodies which
specifically bind to a prototype polypeptide such as the ZmRAD51 polypeptides.
Generally, however, a protein encoded by a polynucleotide of this embodiment does not
bind to antisera raised against the prototype polypeptide when the antisera has been fully
immunosorbed with the prototype polypeptide. Methods of making and assaying for
antibody binding specificity/affinity are well known in the art. Exemplary immunoassay
formats include ELISA, competitive immunoassays, radioimmunoassays, Western blots,
indirect immunofluorescent assays and the like.

Nucleotide sequences encoding ZmRAD51 have now been determined by methods
described more fully in the Examples below. Briefly, DNA encoding ZmRAD51 was
obtained by screening a maize tassel library with a 360 bp probe isolated using a set of
degenerate PCR primers designed from known RAD51 consensus sequences. The nucleic
acid sequences and the corresponding deduced amino acid sequences for the two
ZmRAD51 recombinases are shown below in Tables I (SEQ ID NO: 1) and II (SEQ ID
NO: 5).

Tables I and II disclose the full nucleotide sequence of the cDNA clones for
ZmRAD51A and ZmRAD51B, respectively. The ATG start of translation in each case is
indicated in bold, as is the TGA translation stop codon. Both genes are 1020 nucleotides
long, coding for polypeptides of 340 amino acids. The two maize genes exhibit
substantial identity with each other, primarily in the coding portion; however, they do
diverge in sequence in the untranslated regions, a feature that allowed the identification
of unique sequences suitable for making gene-specific PCR probes. In comparison to the
other RAD51 genes, similarity is also high with the reported tomato sequence. This
similarity, except for conserved regions, decreases when comparisons are made to the
animal RAD51 genes. Table V compares the actual % similarity and % identity in the
polypeptide sequences among the different genes. The two maize RAD51 recombinases
are over 94% similar to each other and 90% identical in the coding portions. Very
similar values are observed when the maize polypeptide sequences are compared to
tomato RAD51. The similarity drops to about 82% and identity to about 69% when
comparing the two maize RAD51 recombinases to animal RAD51 recombinase
sequences.
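For readers unfamiliar with how a % identity figure is derived, the following Python sketch computes it from two sequences that have already been aligned. This is an illustrative assumption about the calculation, not the method used to produce Table V: the function name is hypothetical, gapped columns are skipped (one common convention among several), and % similarity is not computed because it additionally requires a residue substitution matrix.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

The input strings must come from a prior pairwise alignment step, which this sketch does not perform.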

The two cloned maize cDNAs offer both conserved sequences that can be used to
recover other RAD51 related genes, as well as unique sequences suitable for generating
gene- or sequence-specific probes. The two ZmRAD51 cDNAs were cloned into vectors
and unique PCR amplified fragments were subsequently mapped along with an
assortment of other RFLP probes onto previously constructed maize RFLP maps using
different populations generated for this purpose. The vector PHP8057 contains the
ZmRAD51A cDNA cloned into pBlueScript™ vector. The vector PHP8058 contains the
ZmRAD51B cDNA cloned into pBlueScript™. The specific sequences that were PCR
amplified from vectors PHP8057 and PHP8058 and used as fragment probes for the
mapping work are shown in Tables III (SEQ ID NO: 9) and IV (SEQ ID NO: 10). Only
the sense strands are shown in these Tables. The regions corresponding to primers
PHN10664 (5' primer for RAD51A, SEQ ID NO: 19), PHN10665 (5' primer for
RAD51B, SEQ ID NO: 20), and the sequence complement of PHN162 (3' primer for
both) are underlined in the Tables.

The ZmRAD51A gene was mapped in a MARSA (Marker Assisted Recombinant
Selection A population) F4 population generated from crosses of maize lines R03 x N46.
In the RFLP map of the MARSA population, the ZmRAD51A gene mapped to
chromosome 7, about 40% down the length of the linkage group. In the RFLP map of
the ALEB9 population, ZmRAD51B maps on chromosome 3, about 25% down the length
of this linkage group. Each of the clone fragments mapped to a single locus, making
them useful reference markers for those positions on the linkage groups.

Redundancy in the genetic code permits variation in the gene sequences shown in
Table I and Table II. In particular, one skilled in the art will recognize specific codon
preferences by a specific host species and can adapt the disclosed sequence as preferred
for a desired host. For example, preferred codon sequences typically have rare codons
(i.e., codons having a usage frequency of less than about 20% in known sequences of the
desired host) replaced with higher frequency codons. Codon preferences for a specific
organism may be calculated, for example, by utilizing codon usage tables available on
the INTERNET at the following address: http://www.dna.affrc.go.jp/~nakamura/codon.html.
Codon usage in the coding regions of the polynucleotides of the present
invention can be analyzed statistically using commercially available software packages
such as "Codon Preference" available from the University of Wisconsin Genetics
Computer Group (see Devereaux et al., Nucleic Acids Res. 12: 387-395 (1984)) or
MacVector 4.1 (Eastman Kodak Co., New Haven, Conn.). Thus, the present invention
provides a codon usage frequency characteristic of the coding region of at least one of
the polynucleotides of the present invention. The number of polynucleotides that can be
used to determine a codon usage frequency can be any integer from 1 to the number of
polynucleotides of the present invention as provided herein. Optionally, the
polynucleotides will be full-length sequences. An exemplary number of sequences for
statistical analysis can be at least 1, 5, 10, 20, 50, or 100. Nucleotide sequences which
have been optimized for a particular host species by replacing any codons having a usage
frequency of less than about 20% are referred to herein as "codon optimized
sequences."
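The rare-codon replacement described above can be sketched in Python as follows. Everything specific in this example is an assumption made for illustration: the usage fractions in `USAGE` are invented values for an imaginary host and cover only a few amino acids, the 20% threshold is read as a fraction among synonymous codons, and the function name is hypothetical. Real work would load a full codon usage table for the target species.

```python
# HYPOTHETICAL codon usage fractions (per amino acid) for an imaginary host.
# These numbers are placeholders, not measured values for any organism.
USAGE = {
    "L": {"CTG": 0.45, "CTC": 0.25, "TTG": 0.12,
          "CTT": 0.10, "CTA": 0.05, "TTA": 0.03},
    "K": {"AAG": 0.80, "AAA": 0.20},
    "M": {"ATG": 1.00},
    "*": {"TGA": 0.50, "TAG": 0.30, "TAA": 0.20},
}
CODON_TO_AA = {codon: aa for aa, codons in USAGE.items() for codon in codons}


def codon_optimize(cds, threshold=0.20):
    """Replace codons used below `threshold` with the most frequent synonym."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3].upper()
        freqs = USAGE[CODON_TO_AA[codon]]  # synonymous codons for this residue
        if freqs[codon] < threshold:
            codon = max(freqs, key=freqs.get)
        optimized.append(codon)
    return "".join(optimized)
```

Because each codon is only ever swapped for a synonym, the encoded amino acid sequence is unchanged by construction.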

Additional sequence modifications are known to enhance protein expression in a
cellular host. These include elimination of sequences encoding spurious polyadenylation
signals, exon/intron splice site signals, transposon-like repeats, and/or other such
well-characterized sequences which may be deleterious to gene expression. The GC content
of the sequence may be adjusted to levels average for a given cellular host, as calculated
by reference to known genes expressed in the host cell. Where possible, the sequence
may also be modified to avoid predicted hairpin secondary mRNA structures. Other
useful modifications include the addition of a translational initiation consensus sequence
at the start of the open reading frame, as described in Kozak, Mol. Cell Biol.,
9:5073-5080 (1989). Nucleotide sequences which have been optimized for expression in a given
host species by elimination of spurious polyadenylation sequences, elimination of
exon/intron splicing signals, elimination of transposon-like repeats and/or optimization of
GC content in addition to codon optimization are referred to herein as an "expression
enhanced sequence."
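Two of the checks mentioned above, GC content and screening for spurious polyadenylation signals, lend themselves to a brief Python sketch. The function names are hypothetical, and the choice of AATAAA as the motif to screen for is an assumption of this example (it is the widely described canonical polyadenylation signal); the text itself does not name a specific motif.

```python
def gc_content(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq, motif="AATAAA"):
    """0-based start positions of every (possibly overlapping) motif hit."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits
```

A coding sequence could be screened by comparing `gc_content` against the average for known genes of the host and by inspecting any positions returned by `find_motif` inside the open reading frame.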
More effective variants of RAD51A or RAD51B could be synthesized through the
use of in vitro recombination (Zhang, J.-H., G. Dawes, W. P. C. Stemmer. 1997.
Directed evolution of a fucosidase from a galactosidase by DNA shuffling and screening.
Proc. Natl. Acad. Sci. USA 94:4504-4509). For example, the RAD51A and RAD51B
from maize and other species could be recombined using the method of DNA shuffling
and screened or selected for more effective variants.

In addition, the native ZmRAD51 gene or a modified version of the ZmRAD51
gene could be further optimized for expression by omitting the predicted signal and
pre-sequence, replacing the signal sequence with another signal sequence, or replacing the
signal and pre-sequence with another type of targeting or localization sequence. The
ZmRAD51 nuclear localization sequence is located within the 5' end of the coding
region, preferably the first 40 amino acids of sequence SEQ ID NO: 3 or 7, more
preferably the first 30 amino acids of SEQ ID NO: 3 or 7, even more preferably the first
20 amino acids of SEQ ID NO: 3 or 7 or most preferably the first 10 amino acids. The
corresponding polynucleotide sequence would be from nucleotide 53 to 113 of SEQ ID
NO: 1 or nucleotide 73 to 132 of SEQ ID NO: 5 and fragments thereof.

Proteins 

The isolated proteins of the present invention comprise a polypeptide having at
least 10 amino acids encoded by any one of the polynucleotides of the present invention
as discussed more fully, above, or polypeptides which are conservatively modified
variants thereof. The proteins of the present invention or variants thereof can comprise
any number of contiguous amino acid residues from a polypeptide of the present
invention, wherein that number is selected from the group of integers consisting of from
10 to the number of residues in a full-length polypeptide of the present invention.
Optionally, this subsequence of contiguous amino acids is at least 15, 20, 25, 30, 35, or
40 amino acids in length, often at least 50, 60, 70, 80, or 90 amino acids in length.
Further, the number of such subsequences can be any integer selected from the group
consisting of from 1 to 20, such as 2, 3, 4, or 5.
As those of skill will appreciate, the present invention includes catalytically active
polypeptides of the present invention (i.e., enzymes). Catalytically active polypeptides
have a specific activity of at least 20%, 30%, or 40%, and preferably at least 50%,
60%, or 70%, and most preferably at least 80%, 90%, or 95% that of the native
(non-synthetic), endogenous polypeptide. Further, the substrate specificity (kcat/Km) is
optionally substantially similar to the native (non-synthetic), endogenous polypeptide.
Typically, the Km will be at least 30%, 40%, or 50% that of the native (non-synthetic),
endogenous polypeptide; and more preferably at least 60%, 70%, 80%, or 90%.
Methods of assaying and quantifying measures of enzymatic activity and substrate
specificity (kcat/Km) are well known to those of skill in the art.

Generally, the proteins of the present invention will, when presented as an
immunogen, elicit production of an antibody specifically reactive to a polypeptide of the
present invention. Further, the proteins of the present invention will not bind to antisera
raised against a polypeptide of the present invention which has been fully immunosorbed
with the same polypeptide. Immunoassays for determining binding are well known to
those of skill in the art. A preferred immunoassay is a competitive immunoassay as
discussed, infra. Thus, the proteins of the present invention can be employed as
immunogens for constructing antibodies immunoreactive to a protein of the present
invention for such exemplary utilities as immunoassays or protein purification
techniques.

Modulating Polypeptide Levels and/or Composition 

The present invention further provides a method for modulating (i.e., increasing
or decreasing) the concentration or composition of the polypeptides of the present
invention in a plant or part thereof. Modulation can be effected by increasing or
decreasing the concentration and/or the composition (i.e., the ratio of the polypeptides of
the present invention) in a plant. The method comprises introducing into a plant cell
an expression cassette comprising a polynucleotide of the present invention as
described above to obtain a transformed plant cell, culturing the transformed plant cell
under plant cell growing conditions, and inducing or repressing expression of a
polynucleotide of the present invention in the plant for a time sufficient to modulate
concentration and/or composition in the plant or plant part.

In some embodiments, the content and/or composition of polypeptides of the
present invention in a plant may be modulated by altering, in vivo or in vitro, the
promoter of a gene to up- or down-regulate gene expression. In some embodiments, the
coding regions of native genes of the present invention can be altered via substitution,
addition, insertion, or deletion to decrease activity of the encoded enzyme. See, e.g.,
Kmiec, U.S. Patent 5,565,350; Zarling et al., PCT/US93/03868. And in some
embodiments, an isolated nucleic acid (e.g., a vector) comprising a promoter sequence is
transfected into a plant cell. Subsequently, a plant cell comprising the promoter operably
linked to a polynucleotide of the present invention is selected for by means known to
those of skill in the art such as, but not limited to, Southern blot, DNA sequencing, or
PCR analysis using primers specific to the promoter and to the gene and detecting
amplicons produced therefrom. A plant or plant part altered or modified by the
foregoing embodiments is grown under plant forming conditions for a time sufficient to
modulate the concentration and/or composition of polypeptides of the present invention
in the plant. Plant forming conditions are well known in the art and discussed briefly,
supra.

In general, concentration or composition is increased or decreased by at least 5%,
10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% relative to a native control
plant, plant part, or cell lacking the aforementioned expression cassette. Modulation in
the present invention may occur during and/or subsequent to growth of the plant to the
desired stage of development. Modulating nucleic acid expression temporally and/or in
particular tissues can be controlled by employing the appropriate promoter operably
linked to a polynucleotide of the present invention in, for example, sense or antisense
orientation as discussed in greater detail, supra. Induction of expression of a
polynucleotide of the present invention can also be controlled by exogenous
administration of an effective amount of inducing compound. Inducible promoters and
inducing compounds, which activate expression from these promoters, are well known in
the art. In preferred embodiments, the polypeptides of the present invention are
modulated in monocots, particularly maize.


Molecular Markers 

The present invention provides a method of genotyping a plant comprising a
RAD51 polynucleotide. Preferably, the plant is a monocot, such as maize or sorghum.
Genotyping provides a means of distinguishing homologs of a chromosome pair and can
be used to differentiate segregants in a plant population.

Molecular marker methods can be used for phylogenetic studies, characterizing
genetic relationships among crop varieties, identifying crosses or somatic hybrids,
localizing chromosomal segments affecting monogenic traits, map based cloning, and the
study of quantitative inheritance. See, e.g., Plant Molecular Biology: A Laboratory
Manual, Chapter 7, Clark, Ed., Springer-Verlag, Berlin (1997). For molecular marker
methods, see generally, The DNA Revolution by Andrew H. Paterson 1996 (Chapter 2)
in: Genome Mapping in Plants (ed. Andrew H. Paterson) by Academic Press/R. G.
Landis Company, Austin, Texas, pp. 7-21.

The particular method of genotyping in the present invention may employ any
number of molecular marker analytic techniques such as, but not limited to, restriction
fragment length polymorphisms (RFLPs). RFLPs are the product of allelic differences
between DNA restriction fragments caused by nucleotide sequence variability. As is
well known to those of skill in the art, RFLPs are typically detected by extraction of
genomic DNA and digestion with a restriction enzyme. Generally, the resulting
fragments are separated according to size and hybridized with a probe; single copy
probes are preferred. Restriction fragments from homologous chromosomes are
revealed. Differences in fragment size among alleles represent an RFLP. Thus, the
present invention further provides a means to follow segregation of RAD51 genes of the
present invention as well as chromosomal sequences genetically linked to RAD51 genes
using such techniques as RFLP analysis. Linked chromosomal sequences are within 50
centiMorgans (cM), often within 40 or 30 cM, preferably within 20 or 10 cM, more
preferably within 5, 3, 2, or 1 cM of a RAD51 gene of the present invention.

In the present invention, the nucleic acid probes employed for molecular marker
mapping of plant nuclear genomes selectively hybridize, under selective hybridization
conditions, to a gene encoding a RAD51 polynucleotide. In preferred embodiments, the
probes are selected from RAD51 polynucleotides. Typically, these probes are cDNA
probes or Pst I genomic clones. In the present invention probes can be made from the
polynucleotide sequences found in Table III (SEQ ID NO:9), Table IV (SEQ ID NO:10),
or SEQ ID NO:11. RAD51 probes are typically at least 15 bases in length,
more preferably at least 20, 25, 30, 35, 40, or 50 bases in length. Generally, however,
the probes are less than about 1 kilobase in length. Preferably, the probes are single
copy probes that hybridize to a unique locus in a haploid chromosome complement.
Some exemplary restriction enzymes employed in RFLP mapping are EcoRI, EcoRV,
and SstI. As used herein the term "restriction enzyme" includes reference to a
composition that recognizes and, alone or in conjunction with another composition,
cleaves at a specific nucleotide sequence.

The method of detecting an RFLP comprises the steps of (a) digesting genomic
DNA of a plant with a restriction enzyme; (b) hybridizing a nucleic acid probe, under
selective hybridization conditions, to a RAD51 polynucleotide sequence of said genomic
DNA; and (c) detecting therefrom a RFLP.
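Steps (a) through (c) can be mimicked in silico with a short Python sketch. This is a simplified model under stated assumptions, not a laboratory protocol: the digest uses EcoRI (recognition site GAATTC, cut after the first G on the strand shown), exact substring matching stands in for probe hybridization, and all function names are hypothetical.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts between G and AATTC on the strand shown


def digest(seq, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Step (a): cut a linear sequence at every recognition site."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def probe_fragment_sizes(seq, probe):
    """Step (b): sizes of fragments containing the probe sequence.

    Exact matching is a stand-in for selective hybridization.
    """
    return sorted(len(f) for f in digest(seq) if probe.upper() in f)


def is_rflp(allele_a, allele_b, probe):
    """Step (c): an RFLP is seen when probed fragment sizes differ."""
    return (probe_fragment_sizes(allele_a, probe)
            != probe_fragment_sizes(allele_b, probe))
```

In this model a single base change that destroys or creates a recognition site near the probed region changes the size of the detected fragment, which is the allelic difference an RFLP reports.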
Other methods of differentiating polymorphic (allelic) variants of the
polynucleotides of the present invention can be had by utilizing molecular marker
techniques well known to those of skill in the art including such techniques as: 1) single
stranded conformation analysis (SSCP); 2) denaturing gradient gel electrophoresis
(DGGE); 3) RNase protection assays; 4) allele-specific oligonucleotides (ASOs); 5) the
use of proteins which recognize nucleotide mismatches, such as the E. coli mutS protein;
and 6) allele-specific PCR. Other approaches based on the detection of mismatches
between the two complementary DNA strands include clamped denaturing gel
electrophoresis (CDGE); heteroduplex analysis (HA); and chemical mismatch cleavage
(CMC). Thus, the present invention further provides a method of genotyping
comprising the steps of contacting, under stringent hybridization conditions, a sample
suspected of comprising a RAD51 polynucleotide with a nucleic acid probe. Generally,
the sample is a plant sample; preferably, a sample suspected of comprising a maize
RAD51 polynucleotide (e.g., gene, mRNA). The nucleic acid probe selectively
hybridizes, under stringent conditions, to a subsequence of a RAD51 polynucleotide
comprising a polymorphic marker. Selective hybridization of the nucleic acid probe to
the polymorphic marker nucleic acid sequence yields a hybridization complex. Detection
of the hybridization complex indicates the presence of that polymorphic marker in the
sample. In preferred embodiments, the nucleic acid probe comprises a RAD51
polynucleotide.


Expression of Proteins in Host Cells 

Using the nucleic acids of the present invention, one may express a protein of the
present invention in a recombinantly engineered cell such as bacteria, yeast, insect,
mammalian, or preferably plant cells. The cells produce the protein in a non-natural
condition (e.g., in quantity, composition, location, and/or time), because they have been
genetically altered through human intervention to do so.

It is expected that those of skill in the art are knowledgeable in the numerous
expression systems available for expression of a nucleic acid encoding a protein of the
present invention. No attempt to describe in detail the various methods known for the
expression of proteins in prokaryotes or eukaryotes will be made. A review of
expression systems can be found in Recombinant Gene Expression Protocols, Tuan, Ed.,
Humana Press, New Jersey (1997).

WO 99/41394 PCT/US99/02900
In brief summary, the expression of isolated nucleic acids encoding a protein of
the present invention will typically be achieved by operably linking, for example, the
DNA or cDNA to a promoter (which is either constitutive, cell and/or tissue specific, or
inducible), followed by incorporation into an expression vector. The vectors can be
suitable for replication and integration in either prokaryotes or eukaryotes. Typical
expression vectors contain transcription and translation terminators, initiation sequences,
and promoters useful for regulation of the expression of the DNA encoding a protein of
the present invention. To obtain high level expression of a cloned gene, it is desirable to
construct expression vectors which contain, at the minimum, a strong promoter to direct
transcription, a ribosome binding site for translational initiation, and a
transcription/translation terminator. One of skill would recognize that modifications can
be made to a protein of the present invention without diminishing its biological activity.
Some modifications may be made to facilitate the cloning, expression, or incorporation
of the targeting molecule into a fusion protein. Such modifications are well known to
those of skill in the art and include, for example, a methionine added at the amino
terminus to provide an initiation site, or additional amino acids (e.g., poly His) placed on
either terminus to create conveniently located restriction sites or termination codons or
purification sequences.

A. Expression in Prokaryotes 

Prokaryotic cells may be used as hosts for expression. Prokaryotes most
frequently are represented by various strains of E. coli; however, other microbial strains
may also be used. Commonly used prokaryotic control sequences which are defined
herein to include promoters for transcription initiation, optionally with an operator, along
with ribosome binding site sequences, include such commonly used promoters as the beta
lactamase (penicillinase) and lactose (lac) promoter systems (Chang et al., Nature
198:1056 (1977)), the tryptophan (trp) promoter system (Goeddel et al., Nucleic Acids
Res. 8:4057 (1980)) and the lambda derived PL promoter and N-gene ribosome binding
site (Shimatake et al., Nature 292:128 (1981)). The inclusion of selection markers in
DNA vectors transfected in E. coli is also useful. Examples of such markers include
genes specifying resistance to ampicillin, tetracycline, or chloramphenicol.

The vector is selected to allow introduction into the appropriate host cell.
Bacterial vectors are typically of plasmid or phage origin. Appropriate bacterial cells are
infected with phage vector particles or transfected with naked phage vector DNA. If a
plasmid vector is used, the bacterial cells are transfected with the plasmid vector DNA.
Expression systems for expressing a protein of the present invention are available using
Bacillus sp. and Salmonella (Palva, et al., Gene 22: 229-235 (1983); Mosbach, et al.,
Nature 302: 543-545 (1983)).


B. Expression in Eukaryotes 

A variety of eukaryotic expression systems such as yeast, insect cell lines, plant
and mammalian cells, are known to those of skill in the art. As explained briefly below,
a protein of the present invention can be expressed in these eukaryotic systems. In some
embodiments, transformed/transfected plant cells, as discussed infra, are employed as
expression systems for production of the proteins of the instant invention.

Synthesis of heterologous proteins in yeast is well known. Sherman, F., et al.,
Methods in Yeast Genetics, Cold Spring Harbor Laboratory (1982) is a well recognized
work describing the various methods available to produce the protein in yeast. Suitable
vectors usually have expression control sequences, such as promoters, including
3-phosphoglycerate kinase or other glycolytic enzymes, and an origin of replication,
termination sequences and the like as desired. For instance, suitable vectors are
described in the literature (Botstein, et al., Gene 8: 17-24 (1979); Broach, et al., Gene
8: 121-133 (1979)).

A protein of the present invention, once expressed, can be isolated from yeast by
lysing the cells and applying standard protein isolation techniques to the lysates. The
monitoring of the purification process can be accomplished by using Western blot
techniques or radioimmunoassay or other standard immunoassay techniques.

The sequences encoding proteins of the present invention can also be ligated to
various expression vectors for use in transfecting cell cultures of, for instance,
mammalian, insect, or plant origin. Illustrative of cell cultures useful for the production
of the peptides are mammalian cells. Mammalian cell systems often will be in the form
of monolayers of cells although mammalian cell suspensions may also be used. A
number of suitable host cell lines capable of expressing intact proteins have been
developed in the art, and include the HEK293, BHK21, and CHO cell lines. Expression
vectors for these cells can include expression control sequences, such as an origin of
replication, a promoter (e.g., the CMV promoter, a HSV tk promoter or pgk
(phosphoglycerate kinase) promoter), an enhancer (Queen et al., Immunol. Rev. 89: 49
(1986)), and necessary processing information sites, such as ribosome binding sites,
RNA splice sites, polyadenylation sites (e.g., an SV40 large T Ag poly A addition site),
and transcriptional terminator sequences. Other animal cells useful for production of
proteins of the present invention are available, for instance, from the American Type
Culture Collection Catalogue of Cell Lines and Hybridomas (7th edition, 1992).

Appropriate vectors for expressing proteins of the present invention in insect cells
are usually derived from the SF9 baculovirus. Suitable insect cell lines include mosquito
larvae, silkworm, armyworm, moth and Drosophila cell lines such as a Schneider cell
line (See Schneider, J. Embryol. Exp. Morphol. 27: 353-365 (1987)).

As with yeast, when higher animal or plant host cells are employed,
polyadenylation or transcription terminator sequences are typically incorporated into the
vector. An example of a terminator sequence is the polyadenylation sequence from the
bovine growth hormone gene. Sequences for accurate splicing of the transcript may also
be included. An example of a splicing sequence is the VP1 intron from SV40 (Sprague,
et al., J. Virol. 45: 773-781 (1983)). Additionally, gene sequences to control replication
in the host cell may be incorporated into the vector such as those found in bovine
papilloma virus type-vectors. Saveria-Campo, M., Bovine Papilloma Virus DNA a
Eukaryotic Cloning Vector in DNA Cloning Vol. II a Practical Approach, D.M. Glover,
Ed., IRL Press, Arlington, Virginia pp. 213-238 (1985).






Gene Delivery 

An isolated maize RAD51 recombinase gene may be incorporated into a plasmid
and introduced into a host cell, e.g., a heterologous non-maize cell. Expression of the
recombinant ZmRAD51 protein encoded by one of the nucleotide sequences disclosed
herein can provide a source of a substantially pure plant recombinase.

A polynucleotide sequence encoding ZmRAD51A (SEQ ID NO:2) or
ZmRAD51B (SEQ ID NO:6) may be delivered to a host cell such as a plant cell for
transient transformation or stable integration into the plant's genome by methods known
in the art. Preferably, the host cell is a plant cell and, more preferably, a monocot cell,
such as a maize cell. To accomplish such delivery, a nucleotide sequence containing the
coding sequence for ZmRAD51A (SEQ ID NO:2) or the coding sequence for ZmRAD51B
(SEQ ID NO:6) may be attached to regulatory elements needed for the expression of the
gene in a particular host cell or system. These regulatory elements include, for example,
promoters, terminators, and other elements that permit desired expression of the enzyme
in a particular plant host, in a particular tissue or organ of a host such as vascular tissue,
root, leaf, or flower, or in response to a particular signal. These regulatory elements
may also include the native regulatory sequences normally associated with the RAD51
genes in their endogenous state.

Promoters 

A promoter is a DNA sequence that directs the transcription of a structural gene,
e.g., that portion of the DNA sequence that is transcribed into messenger RNA (mRNA)
and then translated into a sequence of amino acids characteristic of a specific
polypeptide. Typically, a promoter is located 5' of the structural gene it controls.
A promoter may be inducible (or derepressible), increasing the rate of transcription
in response to the presence or absence
of a resulting agent. In contrast, a promoter may be constitutive, whereby the rate of
transcription is not regulated by a specific agent. A promoter may be regulated in a
tissue-specific or tissue-preferred manner, such that it is only active in transcribing the
operably linked coding region in a specific tissue type or types, such as plant leaves,
roots, or meristem. Examples of suitable promoters which may be operably linked to the
present ZmRAD51 coding sequences include the maize ubiquitin promoter
(Christensen et al., Plant Mol. Biol., 12:619-632 (1992)) and the ZmDJ1 promoter
(Baszczynski et al., Maydica, 42:189-201 (1997)).

Inducible Promoters 

An inducible promoter useful in the present invention may be operably linked to a
nucleotide sequence encoding ZmRAD51. Optionally, the inducible promoter is
operably linked to a nucleotide sequence encoding a signal sequence which is operably
linked to a nucleotide sequence encoding ZmRAD51. With an inducible promoter, the
rate of transcription increases in response to an inducing agent. Any inducible promoter
can be used in the present invention to direct transcription of ZmRAD51, including those
described in Ward et al., Plant Molecular Biol., 22: 361-366 (1993). Exemplary
inducible promoters include that from the ACE1 system which responds to copper (Mett
et al., Proc. Nat'l Acad. Sci. (U.S.A.), 90: 4567-4571 (1993)); the In2 gene
promoter from maize which responds to benzenesulfonamide herbicide safeners (Hershey
and Stoner, Plant Mol. Biol., 17:679-690 (1991)); and the Tet repressor from Tn10
(Hershey, Mol. Gen. Genetics, 227:229-237 (1991)).

A particularly preferred inducible promoter is one that responds to an inducing 
agent to which plants do not normally respond. One example of such a promoter is the 
steroid hormone gene promoter. Transcription from the steroid hormone gene promoter is 
induced by a glucocorticosteroid hormone (Schena et al., Proc. Nat'l Acad. Sci. 
(U.S.A.), 88:10421 (1991)). 

The present invention also provides an expression vector having an inducible 
promoter operably linked to a nucleotide sequence encoding ZmRAD51. The expression 
vector may be introduced into plant cells and the cells exposed to an inducer of the 
inducible promoter. The cells may then be screened for the presence of ZmRAD51 
protein by immunoassay methods. 

Tissue-specific or Tissue-Preferred Promoters 

An expression vector of the present invention may include a tissue-specific or 
tissue-preferred promoter operably linked to the nucleotide sequence encoding 
ZmRAD51. The expression vector is introduced into plant cells. The cells may be 
screened for the presence of ZmRAD51 protein, e.g., by immunological methods. 
Optionally, the tissue-preferred promoter is operably linked to a nucleotide 
sequence encoding a signal sequence which is operably linked to a nucleotide sequence 
encoding ZmRAD51. Plants transformed with a gene encoding ZmRAD51 operably 
linked to a tissue-specific promoter produce ZmRAD51 protein at least preferentially and, 
preferably, exclusively ("tissue-specific promoter") in a specific tissue. 

Any tissue-specific or tissue-preferred promoter can be utilized in the instant 
invention. Examples of such promoters include a root-preferred promoter such as that 
from the phaseolin gene as described in Sengupta-Gopalan et al., Proc. Nat'l Acad. Sci. 
(U.S.A.), 82:3320-3324 (1985) or the TobRB7 gene characterized by Yamamoto et al., 
Plant Cell, 3:371-382 (1991); a leaf-specific and light-induced promoter such as that 
from cab or rubisco as described in Simpson et al., EMBO J., 4(11):2723-2729 (1985); 
an anther-specific promoter such as that from LAT52 as described in Twell et al., Mol. 
Gen. Genet., 217:240-245 (1989); a pollen-specific promoter such as that from Zm13 as 
described in Guerrero et al., Mol. Gen. Genet., 224:161-168 (1993); and a 
microspore-preferred promoter such as that from apg as described in Twell et al., Sex. 
Plant Reprod., 6:217-224 (1993). 

Other tissue-specific promoters useful in the present invention include a 
phloem-preferred promoter such as that associated with the Arabidopsis sucrose synthase 
gene as described in Martin et al., 1993, The Plant Journal 4:367-377; a floral-specific 
promoter such as that of the Arabidopsis HSP 18.2 gene described in Tsukaya et al., 
Mol. Gen. Genet., 237:26-32 (1993) and of the Arabidopsis HMG2 gene as described in 
Enjuto et al., Plant Cell, 7:517-527 (1995). 

Constitutive Promoters 

Alternatively, the nucleotide sequence encoding ZmRAD51 may be operably 
linked to a constitutive promoter. Optionally, the constitutive promoter is operably 
linked to a nucleotide sequence encoding a signal sequence which is operably linked to a 
nucleotide sequence encoding ZmRAD51. Many different constitutive promoters can be 
utilized in the instant invention to express ZmRAD51. Examples include promoters from 
plant viruses such as the 35S promoter from cauliflower mosaic virus (CaMV), as 
described in Odell et al., Nature, 313:810-812 (1985), and promoters from genes such as 
rice actin (McElroy et al., Plant Cell, 2:163-171 (1990)); ubiquitin (Christensen et al., 
Plant Mol. Biol., 12:619-632 (1992)); pEMU (Last et al., Theor. Appl. Genet., 
81:581-588 (1991)); MAS (Velten et al., EMBO J., 3:2723-2730 (1984)); and maize H3 
histone (Lepetit et al., Mol. Gen. Genet., 231:276-285 (1992)). 

Additional Regulatory Elements 

Additional regulatory elements that may be connected to the ZmRAD51 nucleic 
acid sequence for expression in plant cells include terminators, polyadenylation 
sequences, and nucleic acid sequences encoding signal peptides that permit localization 
within a plant cell or secretion of the protein from the cell. Such regulatory elements 
and methods for adding or exchanging these elements with the regulatory elements of the 
ZmRAD51 gene are known, and include, but are not limited to, 3' termination and/or 
polyadenylation regions such as those of the Agrobacterium tumefaciens nopaline 
synthase (nos) gene (Bevan et al., Nucl. Acids Res., 12:369-385 (1983)); the potato 
proteinase inhibitor II (PinII) gene (Keil et al., Nucl. Acids Res., 14:5641-5650 (1986)); 
and the CaMV 19S gene (Mogen et al., Plant Cell, 2:1261-1272 (1990)). 

Plant signal sequences, including, but not limited to, signal-peptide encoding 
DNA/RNA sequences which target proteins to the extracellular matrix of the plant cell 
(Dratewka-Kos et al., J. Biol. Chem., 264:4896-4900 (1989)) and the Nicotiana 
plumbaginifolia extensin gene (DeLoose et al., Gene, 99:95-100 (1991)), or signal 
peptides which target proteins to the vacuole like the sweet potato sporamin gene 
(Matsuka et al., Proc. Nat'l Acad. Sci. (U.S.A.), 88:834 (1991)) and the barley lectin 
gene (Wilkins et al., Plant Cell, 2:301-313 (1990)), or signals which cause proteins to be 
secreted such as that of PR1b (Lind et al., Plant Mol. Biol., 18:47-53 (1992)), or those 
which target proteins to the plastids such as that of rapeseed enoyl-ACP reductase 
(Verwaert et al., Plant Mol. Biol., 26:189-202 (1994)) are useful in the invention. 

Another regulatory element that may be employed in combination with the 
ZmRAD51 nucleic acid sequence for expression in plant cells is a nuclear localization 
sequence ("NLS") which directs localization of the expressed ZmRAD51 protein to 
the nucleus of a plant cell. Examples of suitable nuclear localization sequences may be 
found in Kalderon et al., Cell, 39:499-509 (1984) and Hicks et al., Plant Cell, 
5:983-994 (1993). Alternatively, the native ZmRAD51 nuclear localization signal located 
in the 5' region of the coding sequence, most preferably from nucleotide 53 to 113 of 
SEQ ID NO:1 or nucleotide 73 to 132 of SEQ ID NO:5, could be used. 



Gene Delivery Methods 

Numerous methods for introducing foreign genes into plant cells are known and 
can be used to insert a ZmRAD51 gene into a plant host cell, including biological and 
physical DNA delivery protocols. See, for example, Miki et al., "Procedure for 
Introducing Foreign DNA into Plants," in: Methods in Plant Molecular Biology and 
Biotechnology, Glick and Thompson, eds., CRC Press, Inc., Boca Raton, pages 67-88 
(1993). The methods chosen vary with the host plant, and include chemical transfection 
methods such as calcium phosphate, microorganism-mediated gene transfer such as 
Agrobacterium (Horsch et al., Science, 227:1229-31 (1985)), electroporation, 
micro-injection, and biolistic bombardment. 

Expression cassettes and vectors and in vitro culture methods for plant cell or 
tissue transformation and regeneration of plants are also known and available. See, for 
example, Gruber et al., "Vectors for Plant Transformation," in: Methods in Plant 
Molecular Biology and Biotechnology, Glick and Thompson, eds., CRC Press, Inc., 
Boca Raton, pages 89-119 (1993). As used herein, an "expression cassette" is a nucleic 
acid construct, generated recombinantly or synthetically, with a series of specified 
nucleic acid elements which permit transcription of a particular nucleic acid in a host 
cell. The expression cassette can be incorporated into a plasmid, chromosome, 
mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the 
expression cassette portion of an expression vector includes, among other sequences, a 
nucleic acid to be transcribed, and a promoter. 



Agrobacterium-mediated Gene Delivery 

One widely utilized method for introducing an expression vector into plants is 
based on the natural transformation system of Agrobacterium. A. tumefaciens and A. 
rhizogenes are plant pathogenic soil bacteria that genetically transform plant cells. The 
Ti and Ri plasmids of A. tumefaciens and A. rhizogenes, respectively, carry genes 
responsible for genetic transformation of plants (see, e.g., Kado, Crit. Rev. Plant Sci., 
10:1 (1991)). Descriptions of the Agrobacterium vector systems and methods for 
Agrobacterium-mediated gene transfer are provided in Gruber et al., supra; see also 
Hiei et al., U.S. Patent No. 5,591,616, issued January 7, 1997. 



Direct Gene Transfer 

Despite the fact that the host range for Agrobacterium-mediated transformation is 
broad, some major cereal crop species and gymnosperms have generally been 
recalcitrant to this mode of gene delivery, even though some success has recently been 
achieved in rice and maize (Hiei et al., The Plant Journal, 6:271-282 (1994); Ishida et 
al., Nature Biotechnology, 14:745-750 (1996); Hiei et al., U.S. Patent No. 5,591,616, 
issued January 7, 1997). Several other methods of introducing foreign DNA into plant 
cells, collectively referred to as direct gene transfer, have been developed as an 
alternative to Agrobacterium-mediated delivery. 

A generally applicable method of delivering DNA into plant cells is 
microprojectile-mediated delivery, where DNA is carried on the surface of 
microprojectiles measuring about 1 to 4 µm in diameter. The expression vector is 
introduced into plant tissues with a biolistic device that accelerates the microprojectiles to 
speeds of 300 to 600 m/s, which is sufficient to penetrate the plant cell walls and 
membranes (e.g., Klein et al., Biotechnology, 10:268 (1992)). 

Another method for physical delivery of DNA to plants is sonication of target 
cells as described in Zang et al., Bio/Technology, 9:996 (1991). Alternatively, 
liposome or spheroplast fusions have been used to introduce expression vectors into 
plants (see, e.g., Christou et al., Proc. Nat'l Acad. Sci. (U.S.A.), 84:3962 (1987)). 
Direct uptake of DNA into protoplasts using CaCl2 precipitation, polyvinyl alcohol or 
poly-L-ornithine has also been reported (see, for example, Hain et al., Mol. 
Gen. Genet., 199:161 (1985)). Electroporation of protoplasts and whole cells and tissues 
has also been described (see, for example, Spencer et al., Plant Mol. Biol., 24:51-61 
(1994)). 

Particle Wounding/Agrobacterium Delivery 

Another useful basic transformation protocol involves a combination of wounding 
by particle bombardment, followed by use of Agrobacterium for DNA delivery, as 
described by Bidney et al., Plant Mol. Biol., 18:301-313 (1992). Useful plasmids for 
this delivery method are ones containing a Bin 19 backbone (see Bevan, Nucleic Acids 
Research, 12:8711-8721 (1984)). This method is commonly used to deliver heterologous 
DNA into sunflower cells. 

Assay Methods 

Transgenic plant cells, callus, tissues, shoots, and transgenic plants may be tested 
for the presence of the ZmRAD51 gene by DNA analysis and for expression of the gene by 
immunoassay. For example, the presence of a ZmRAD51 gene can be confirmed by 
Southern analysis. This common procedure may be carried out by isolating DNA from 
the cells in question, cutting the DNA using restriction enzymes, and fractionating the 
resulting DNA fragments on an agarose gel to separate the fragments by molecular 
weight. The separated DNA fragments are then transferred to nitrocellulose membranes, 
hybridized with a radioactively labeled probe fragment (e.g., labeled with 32P), and 
washed with an SDS solution (see, e.g., Southern, J. Molec. Biol., 98:503-517 (1975)). 
Alternatively, the presence of the ZmRAD51 gene in transgenic cells may be verified by 
amplifying the gene (or a portion of the gene) by polymerase chain reaction ("PCR") 
using appropriate primers, cutting the DNA from the PCR with restriction enzymes, 
fractionating the resulting DNA fragments as described above, and detecting the 
PCR-amplified DNA fragment with an appropriate probe (see, e.g., Saiki, Science, 
239:487-491 (1988)). 

Instead of examining the transgenic cells for the presence of the ZmRAD51 gene, 
expression of the gene by the transgenic cell may be probed using an immunoassay 
technique to establish the presence of expressed ZmRAD51 protein. This is typically 
carried out by probing the protein fraction from the transgenic cells with an antibody 
specific for the ZmRAD51 protein. The presence of the resulting ZmRAD51 
protein/antibody complex can be detected using a variety of well known techniques. 

The invention is further characterized by the following examples. These 
examples are not meant to limit the scope of the invention as set forth in the foregoing 
description and variations within the concepts of the invention will be apparent. 


EXAMPLES 

Example 1 - Cloning of ZmRAD51A & ZmRAD51B cDNA 
A. Recovery of a ZmRAD51 probe fragment: 

Poly-A mRNA from maize (cv. A632) tassels was prepared using the 
MicroQuick kit (Pharmacia, Piscataway, NJ). Room temperature PCR was performed 
on the mRNA using a set of degenerate primers designed from known RAD51 consensus 
sequences. The PCR-amplified fragment was cloned and sequenced, and confirmed to 
be a 360 bp cDNA sequence for a RAD51 homolog. The probe fragment clone 
corresponding to the Zea mays cDNA was designated PHP7763 and had the following 
sequence: 
ACATTCAGACCACAAAGGCTCTTGCAGATTGCTGACAGGTTTGGACTGAATGG 
TGCTGATGTGTTAGAGAATGTGGCTTATGCCAGAGCTTATAATACGGATCATC 
AATCTAGACTTCTGCTGGAAGCAGCTTCCATGATGATAGAGACCAGGTTTACT 
CTTATGGTTGTAGACAGTGCCACAGCTCTGTACAGAACTGATTTCTCAGGAAG 
AGGGGAACTATCAGCGAGGCAAATGCACATGGCTAAGTTCCTGAGGAGCCTT 
CAGAAGTTAGCTGATGAGTTTGGAGTAGCTGTGGTTATCACCAATCAAGTAGT 
GGCCCAAGTGGATGGATCTACTATGTTTGCTGGGCCGCAGTTC (SEQ ID 
NO:11). 

The PHP7763 probe sequence was used to design and synthesize two maize 
sequence-specific oligonucleotide primers, PHN7443 having the nucleotide sequence 
5'-TATAGAATTCCACAAAGGCTCTTGCAGATTGCTGACAG (SEQ ID NO:12) and 
PHN7444 having the sequence 
5'-ATACTCGAGGCCCAGCAAACATAGTAGATCCATCCAC (SEQ ID NO:13). 
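
As an illustrative check only (it is not part of the cloning procedure described here), the relationship between the two primers and the PHP7763 probe fragment can be examined computationally. The Python sketch below assumes the sequences exactly as printed in this example; the helper names `revcomp` and `longest_3prime_match` are hypothetical. It finds the longest 3' portion of each primer that anneals to the probe and derives the product size expected from the printed sequences. The non-annealing 5' tails carry GAATTC (the EcoRI recognition sequence) and CTCGAG (the XhoI recognition sequence).

```python
# Sketch under stated assumptions: sequences are transcribed from the text;
# helper names are illustrative and not taken from the source.

PROBE = (
    "ACATTCAGACCACAAAGGCTCTTGCAGATTGCTGACAGGTTTGGACTGAATGG"
    "TGCTGATGTGTTAGAGAATGTGGCTTATGCCAGAGCTTATAATACGGATCATC"
    "AATCTAGACTTCTGCTGGAAGCAGCTTCCATGATGATAGAGACCAGGTTTACT"
    "CTTATGGTTGTAGACAGTGCCACAGCTCTGTACAGAACTGATTTCTCAGGAAG"
    "AGGGGAACTATCAGCGAGGCAAATGCACATGGCTAAGTTCCTGAGGAGCCTT"
    "CAGAAGTTAGCTGATGAGTTTGGAGTAGCTGTGGTTATCACCAATCAAGTAGT"
    "GGCCCAAGTGGATGGATCTACTATGTTTGCTGGGCCGCAGTTC"
)
PHN7443 = "TATAGAATTCCACAAAGGCTCTTGCAGATTGCTGACAG"  # forward primer
PHN7444 = "ATACTCGAGGCCCAGCAAACATAGTAGATCCATCCAC"   # reverse primer


def revcomp(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_3prime_match(primer, template):
    """Return (position, length) of the longest 3' suffix of primer found in template."""
    for start in range(len(primer)):
        suffix = primer[start:]
        pos = template.find(suffix)
        if pos >= 0:
            return pos, len(suffix)
    return -1, 0


# The forward primer anneals to the strand shown; the reverse primer to its complement.
fwd_pos, fwd_len = longest_3prime_match(PHN7443, PROBE)
rev_pos, rev_len = longest_3prime_match(PHN7444, revcomp(PROBE))

# Product = forward 5' tail + template span between the primer sites + reverse 5' tail.
fwd_tail = len(PHN7443) - fwd_len
rev_tail = len(PHN7444) - rev_len
template_span = (len(PROBE) - rev_pos) - fwd_pos
amplicon_length = fwd_tail + template_span + rev_tail

print("forward primer anneals over", fwd_len, "nt at probe position", fwd_pos)
print("reverse primer anneals over", rev_len, "nt")
print("expected product length:", amplicon_length, "bp")
```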

B. Lambda library screening. 

A lambda cDNA library made by Stratagene (La Jolla, CA) from supplied maize 
(cv. A632) tassel RNA was screened using standard procedures as described in 
Molecular Cloning: A Laboratory Manual, Second Edition, Sambrook, Fritsch and 
Maniatis, Cold Spring Harbor Laboratory Press (1989). The library was plated at 
approximately 50,000 plaques per 150 mm plate, transferred to filters (Magnalift brand, 
MSI, Inc., Westborough, MA), and screened using a digoxigenin-dUTP-labeled 
PCR-amplified probe generated using oligonucleotide primers PHN7443 and PHN7444. 
Labeling and hybridization conditions were as described in the Genius™ System manual 
by the manufacturer (Boehringer Mannheim Corp., Indianapolis, IN), with 
modifications. The full protocols used are listed in Examples 3 and 4. 

The filter probing resulted in 32 spots which aligned well on the lumigraphs from 
duplicate lifts, 11 more with less optimal alignment, 8 with fair alignment and 23 with no 
corresponding alignment on the duplicate lift. The corresponding plaques were picked 
for further evaluation. Additional PCR amplification reactions designed to eliminate 
false positives were carried out using the PHN7443 and PHN7444 primers above plus 
M13 forward (PHN162, 5'-TCCCAGTCACGACGTTGTAAAACG; SEQ ID NO:14) 
and reverse (PHN487, 5'-AGCGGATAACAATTTCACACAGGAAACAGCTATGAC; 
SEQ ID NO:15) primers. Six plaque picks were recovered, titered, replated to give 



WO 99/41394 PCT/US99/02900 


several hundred plaques per plate and lifted and reprobed as described above. 
Confirmed positive plaques were processed through an in vivo "pop-out" protocol that 
generates phagemid (plasmid) DNA from infecting lambda phage. The protocol used 
was developed at Stratagene (La Jolla, CA) and is described in the directions that 
accompany their Lambda Zap libraries. When successfully performed, this procedure 
yields E. coli colonies that contain a pBlueScript™ (Stratagene, La Jolla, CA) type 
plasmid that has been excised from the lambda phage. Each plaque pick used yielded 
several colonies following this protocol. Additional screening of the plasmid clones 
resulted in three unique clones, as determined by restriction enzyme analysis. Partial 
sequencing and comparison to known higher eukaryotic RAD51 genes confirmed that 
these clones corresponded to maize homologs of RAD51. Of the three clones, two were 
identical in sequence except that one of the two had a longer 3' untranslated region and 
was truncated in the coding region. The third clone shared very high identity in the 
coding region with the first two clones, but differed in the untranslated regions. The 
three clones were designated PHP7981, PHP7982 and PHP7983. PHP7981 
(ZmRAD51A) and PHP7983 (ZmRAD51B) were completely sequenced (see Tables I and 
II herein). The RAD51A 3' untranslated region can be seen in SEQ ID NO:4 (nucleotide 
1078 to 1538) and the RAD51B 3' untranslated region can be seen in SEQ ID NO:8 
(nucleotide 1099 to 1556). 

In order to facilitate later cloning of the ZmRAD51 genes, site-directed 
mutagenesis (by the method of Su et al., Gene, 69:81-89 (1988)) was used to introduce 
restriction sites flanking the coding sequences. A HpaI restriction site was introduced 
downstream of the stop codon of PHP7981 (position shown in Table I) using the 
oligonucleotide primer PHN9611 
(5'-GTATTGCAGATGTTAAGGATTGAGACCATACCTGGTTAACAGGCATCTCAG-3'; 
SEQ ID NO:16) to create the plasmid PHP8057. A BamHI site was introduced 5' to 
the start codon of PHP7983 (position shown in Table II) with the primer PHN9612 
(5'-GCAGCCAGGGATCCACATGTCCTCGTC-3'; SEQ ID NO:17), and a HpaI site was 
inserted 3' to the stop codon (position shown in Table II) with oligonucleotide PHN9613 
(5'-CTGATGTCAAGGACTGAAAGCATCCTCATTTGCAGTTAACAGCATAACTTGC-3'; 
SEQ ID NO:18) to create the plasmid PHP8058. These newly created clones served 
as sources of probes for mapping studies. 
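
For illustration, the presence of the intended recognition sequences in the three mutagenic oligonucleotides can be confirmed with a few lines of Python. This is a minimal sketch, assuming the oligonucleotide sequences as printed in this example (PHN9612 is read without internal punctuation) and the standard recognition sequences GTTAAC for HpaI and GGATCC for BamHI; the function name `site_positions` is hypothetical.

```python
# Sketch only: oligonucleotide sequences are transcribed from the text.
# Recognition sequences: HpaI = GTTAAC, BamHI = GGATCC.

SITES = {"HpaI": "GTTAAC", "BamHI": "GGATCC"}

# oligo name -> (sequence, site the oligo is meant to introduce)
OLIGOS = {
    "PHN9611": ("GTATTGCAGATGTTAAGGATTGAGACCATACCTGGTTAACAGGCATCTCAG", "HpaI"),
    "PHN9612": ("GCAGCCAGGGATCCACATGTCCTCGTC", "BamHI"),
    "PHN9613": ("CTGATGTCAAGGACTGAAAGCATCCTCATTTGCAGTTAACAGCATAACTTGC", "HpaI"),
}


def site_positions(seq, site):
    """Return every 0-based start position of site within seq."""
    positions = []
    pos = seq.find(site)
    while pos >= 0:
        positions.append(pos)
        pos = seq.find(site, pos + 1)
    return positions


results = {
    name: site_positions(seq, SITES[enzyme])
    for name, (seq, enzyme) in OLIGOS.items()
}

for name, (seq, enzyme) in OLIGOS.items():
    print(name, "carries", enzyme, "at position(s)", results[name])
```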

Example 2 - Mapping of the Maize ZmRAD51 Clones 

For ZmRAD51 sequence-specific hybridization, oligonucleotide primers 
homologous to unique sequences in the 3' untranslated regions of ZmRAD51A 
(PHN10664; 5'-CCATACCTGCTTTACAGGCATC-3'; SEQ ID NO:19) or of 
ZmRAD51B (PHN10665; 5'-CATCCTCATTTGSAGTCCACAG-3'; SEQ ID NO:20; 
where "S" denotes a mixture of "C" and "G") were synthesized and used in 
conjunction with an M13 universal sequencing primer (PHN162; 
5'-TCCCAGTCACGACGTTGTAAAACG-3'; SEQ ID NO:14) to PCR amplify probe 
fragments from the two vectors PHP8057 (ZmRAD51A) and PHP8058 (ZmRAD51B). 
Sequences of PHN10664 and PHN10665 span the regions mutagenized to create the 
HpaI sites, but themselves correspond to the sequences of the original clones in 
PHP7981 and PHP7983. There was enough identity to permit efficient PCR 
amplification using PHP8057 and PHP8058 as templates. This approach was used in 
order to generate final probe fragments identical to the original ZmRAD51A and 
ZmRAD51B genes. These fragments, which extend from just downstream of the 
translation stop codon to the end of the poly(A) tail of the cDNA sequences, were 
subsequently used as probes against two maize populations and map positions were 
determined. 
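
The statement that the mapping primers retain enough identity with the mutagenized templates can be illustrated by counting mismatches. The sketch below is an assumption-laden illustration, not a description of how the primers were designed: it uses the mutagenic oligonucleotides PHN9611 and PHN9613 from Example 1 as stand-ins for the mutated regions of PHP8057 and PHP8058, treats "S" as matching either C or G, and uses a hypothetical helper `best_alignment`.

```python
# Sketch only: the mutagenic oligos stand in for the mutated template regions
# (an assumption); helper names are illustrative.

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "CG"}

PHN10664 = "CCATACCTGCTTTACAGGCATC"   # ZmRAD51A 3' UTR primer
PHN10665 = "CATCCTCATTTGSAGTCCACAG"   # ZmRAD51B 3' UTR primer (S = C or G)

# Regions of PHP8057 / PHP8058 after mutagenesis, represented by the oligos used.
MUTATED_A = "GTATTGCAGATGTTAAGGATTGAGACCATACCTGGTTAACAGGCATCTCAG"
MUTATED_B = "CTGATGTCAAGGACTGAAAGCATCCTCATTTGCAGTTAACAGCATAACTTGC"


def best_alignment(primer, template):
    """Slide primer along template; return (fewest mismatches, start position)."""
    best = (len(primer) + 1, -1)
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        mismatches = sum(1 for p, t in zip(primer, window) if t not in IUPAC[p])
        best = min(best, (mismatches, start))
    return best


mism_a, pos_a = best_alignment(PHN10664, MUTATED_A)
mism_b, pos_b = best_alignment(PHN10665, MUTATED_B)

print("PHN10664 vs mutated ZmRAD51A region:", mism_a, "mismatches of", len(PHN10664))
print("PHN10665 vs mutated ZmRAD51B region:", mism_b, "mismatches of", len(PHN10665))
```

Under these assumptions each 22-nucleotide primer differs from its mutagenized template at only the positions altered to create the HpaI site, which is consistent with the amplification reported above.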

Southern hybridizations were carried out using two different maize populations 
generated as part of a breeding program. Population 1 (MARSA), an F4, was generated 
from crosses of the lines R03 x N46, and included 200 individuals as part of the mapping 
family. Population 2 (ALEB9), an F2, was generated from crosses of the lines R67 x 
P38 and contained 240 individuals. DNA was isolated from each individual by a CTAB 
extraction method (Saghai-Maroof et al., Proc. Nat'l Acad. Sci. (U.S.A.), 81:8014-8018 
(1994)) and digested individually with restriction enzymes BamHI, HindIII, EcoRI and 
EcoRV. Digests were separated on agarose gels and transferred to membranes 
(Southern, J. Molec. Biol., 98:503-517 (1975)) prior to hybridization (Helentjaris et al., 
Plant Mol. Biol., 5:109-118 (1985)) with an array of probes to establish the basic RFLP 
map. Population 1 membranes were hybridized using 179 RFLP probes, while 
population 2 membranes were hybridized using 115 RFLP probes. After hybridization 
the membranes were exposed to x-ray film for an appropriate length of time to be 
visually scored. All data were entered into an electronic database and map positions of 
the RFLP probes (Evola et al., Theor. Appl. Genet., 71:765-771 (1986)) were determined 
using MAPMAKER (Lincoln et al., in Constructing Genetic Linkage Maps with 
MAPMAKER/EXP Version 3.0: A Tutorial and Reference Manual, Whitehead Institute 
for Biomedical Research, Cambridge, MA (1993)) and a map was constructed for each 
population. Table VI lists the positions of a number of markers, including the 
ZmRAD51A gene, mapped on the MARSA population. Table VII lists the positions of a 
number of markers, including the ZmRAD51B gene, mapped on the ALEB9 population. 

Example 3 - Hybridization Procedure 

A prehybridization solution was prepared containing the following components: 
1% BMB Blocking reagent 
1% gelatin 
0.2% SDS 
0.1% Sarkosyl (n-lauryl sarkosine) 
5xSSC (750 mM sodium chloride, 75 mM sodium citrate, pH 7.0) 
The solution was heated to facilitate the dissolution of the blocking reagent and the 
gelatin. After being wet with 2xSSC, filters were placed in the prehybridization solution 
and incubated at 68 °C for 2 hrs with shaking. 

Hybridization was carried out by a procedure which included denaturing a 
digoxigenin-labeled probe by boiling for 10 minutes and then plunging into an ice water 
bath. The probe was added to give 10 to 20 ng of probe per ml of prehybridization 
solution and mixed well to form a hybridization solution. If the hybridization solution 
was to be reused, it was heated to 95 °C for 10 minutes. The equilibrated filters were 
incubated overnight at 68 °C, with gentle shaking or other form of agitation. 

The incubated filters were washed for 5 minutes in a dish with 2xSSC + 0.1% 
SDS. This wash was repeated two additional times. The filters were then washed two 
times for 1.5 hrs in 0.5xSSC + 0.1% SDS at 60-65 °C using pre-warmed wash solution. 
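
The SSC strengths and probe amounts used in this example follow from simple scaling, which the following Python sketch makes explicit. It assumes only what is stated above (5xSSC is 750 mM sodium chloride and 75 mM sodium citrate; probe is used at 10 to 20 ng per ml); the function names and the 25 ml example volume are hypothetical.

```python
# Sketch only: scaling arithmetic for the solutions named in this example.
# 1x SSC is derived from the stated 5x composition (750 mM NaCl, 75 mM citrate).

SSC_1X_MM = {"sodium chloride": 750.0 / 5, "sodium citrate": 75.0 / 5}


def ssc(fold):
    """Return the millimolar composition of SSC at the given strength."""
    return {salt: conc * fold for salt, conc in SSC_1X_MM.items()}


def probe_range_ng(volume_ml, low_ng_per_ml=10.0, high_ng_per_ml=20.0):
    """Return (minimum, maximum) ng of labeled probe for a hybridization volume."""
    return volume_ml * low_ng_per_ml, volume_ml * high_ng_per_ml


# Wash schedule as described: (solution, minutes per wash, number of washes).
WASHES = [
    ("2xSSC + 0.1% SDS", 5, 3),
    ("0.5xSSC + 0.1% SDS", 90, 2),
]
total_wash_minutes = sum(minutes * repeats for _, minutes, repeats in WASHES)

print("2xSSC:", ssc(2))
print("0.5xSSC:", ssc(0.5))
print("probe for 25 ml (hypothetical volume):", probe_range_ng(25), "ng")
print("total wash time:", total_wash_minutes, "minutes")
```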



Example 4 - Labeling Procedure 

A. Antibody Probing 

Antibody probing was conducted by performing the following steps with 
individual filters in Petri dishes. A Genius™ blocking solution was prepared by 
dissolving 1% BMB Block, 1% gelatin and 0.5% Tween 20 in Genius™ 1 buffer. The 
Genius™ 1 buffer was heated to dissolve the blocking reagent and gelatin, then cooled to 
room temperature. 

Filters were washed 3 times in Genius™ 1 Buffer (5 minutes per wash) and then 
incubated 1 hr in Genius™ Blocking solution, with gentle rocking. An 
anti-digoxigenin/alkaline phosphatase conjugate was diluted 1:100 in leftover block 
(directly into the blocking solution on the filters). After incubation for 0.5 hour, the 
filters were washed 2 times for 15 minutes in Genius™ 1 Buffer. 

B. Chemiluminescent read-out 

Without being allowed to dry, filters were washed 2 times in Genius™ Buffer 3 
(15 minutes per wash). At the start of the first wash, LumiPhos530 was taken out of the 
refrigerator. X-ray cassettes were prepared by placing transparency sheets (3M AF4300 
sheets) in the cassettes and taping them in place. About 6 mL of LumiPhos530 was 
placed into a Petri dish lid, taking care to handle the LumiPhos aseptically. Filters were 
removed from the Genius™ Buffer 3 and most of the buffer was wicked onto Whatman 
3MM paper by just touching the filter's edge to the Whatman paper. The filter was then 
placed plaque side down onto the LumiPhos530, ensuring good coverage on the plaque 
side and allowing most of the LumiPhos to drain off back into the lid. The filter was 
then placed in the X-ray cassette, plaque side up, on the transparency sheet(s). As soon 
as a transparency sheet was full, another sheet was placed on top of it and bubbles were 
smoothed out. The top sheet was taped in place. Putting the top sheet in place quickly 
prevented the filters from drying out. The filters in the transparency sheet were exposed 
to a Kodak XAR-5 film sheet for 1 to 1.5 hrs. This first film was developed and another 
XAR-5 sheet was exposed overnight. The overnight sheet was very overexposed, but the 
stab marks showed as white spots, allowing ready alignment of the plate to the 
lumigraph. 



C. Picking plaques 

Tubes were prepared with SM buffer (100 mM NaCl, 8 mM MgSO4·7H2O, 50 
mM Tris-Cl, pH 7.5, 0.01% (w/v) gelatin), usually about 1 mL. Stab marks on the 
plates were aligned to the marks on the lumigraph. Plaques of interest were picked by 
poking a pipette into the agar and removing a plug containing the plaque using either the 
back of a 5 mL pipette (for 1° picks) or a transfer pipette (for 2° picks). Each plug was 
added to a tube containing the SM buffer. One drop of chloroform (free of isoamyl 
alcohol) was then added to the buffer. After capping and vortexing well, the tube was 
incubated at room temperature for 1 to 2 hours, and then stored at 4°C. 

Example 5 - Creation of ZmRAD51-Containing Plant Transformation Vectors 

Constructs for plant transformation experiments were created in which the 
ZmRAD51A or ZmRAD51B genes were inserted behind a maize ubiquitin promoter 
(Christensen et al., Plant Mol. Biol., 18:1185-1187 (1992)). To facilitate cloning the two 
ZmRAD51 genes as BamHI/HpaI fragments, a BamHI site was created 5' to the start of 
translation of ZmRAD51 in PHP8057 by PCR. The PCR-modified ZmRAD51A and the 
ZmRAD51B genes from PHP8057 and PHP8058 then were inserted as BamHI/HpaI 
fragments downstream of the 2.0 kb PstI fragment of the maize ubiquitin promoter and 
upstream of the potato proteinase inhibitor II (PinII) terminator (bases 2 to 310 from An 
et al., Plant Cell, 1:115-122 (1989)) in a pUC19 plasmid backbone to make PHP8060 
(Figure 1) and PHP8103 (Figure 2). 

Another set of constructs was made using either the ubiquitin promoter or the 
maize ZmDJ1 promoter (Baszczynski et al., Maydica, 42:189-201 (1997)), but where 
the complete ZmRAD51A or ZmRAD51B genes were first fused to the 3' end of a green 
fluorescent protein ("GFP"; Chalfie et al., Science, 263:802-805 (1994)) gene that was 
previously synthesized so as to encode maize-preferred codons (new gene designated 
"GFPm") as described in PCT Patent Application No. PCT/US97/07688. To construct 
the protein fusions, the GFPm stop codon was removed and a BglII site was generated by 
site-directed mutagenesis. This 0.8 kb BamHI/BglII fragment of the GFPm coding 
sequence was inserted into the BamHI site 5' to the start of ZmRAD51 in PHP8060 and 
PHP8103 from above to create PHP8744 (Figure 3) and PHP8745, respectively. This 
process created fusions of GFPm to ZmRAD51A or ZmRAD51B joined by a 6 bp linker 
encoding isoleucine and histidine (junctions shown below). 



                           BglII/BamHI 
                 GFPm end       |    ZmRAD51A start 
                           |    |    | 
PHP8744  GAC GAG CTC TAC AAG atc cac ATG TCG TCG GCG GCG CAG 
          D   E   L   Y   K   I   H   M   S   S   A   A   Q 

                           BglII/BamHI 
                 GFPm end       |    ZmRAD51B start 
                           |    |    | 
PHP8745  GAC GAG CTC TAC AAG atc cac ATG TCC TCG TCT TCG GCG 
          D   E   L   Y   K   I   H   M   S   S   S   S   A 
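
The reading frame across each fusion junction can be checked by translating the codons with the standard genetic code. The Python sketch below is offered for illustration only; the junction sequences are transcribed from the diagrams in this example (reading the last two GFPm codons as TAC AAG and the linker as atc cac), and the names `CODON_TABLE` and `translate` are hypothetical.

```python
# Sketch only: translate the GFPm/ZmRAD51 junctions with the standard genetic code.
from itertools import product

BASES = "TCAG"
# Amino acids for codons TTT, TTC, TTA, TTG, TCT, ... GGG (third base varies fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

JUNCTIONS = {
    "PHP8744": "GAC GAG CTC TAC AAG atc cac ATG TCG TCG GCG GCG CAG",
    "PHP8745": "GAC GAG CTC TAC AAG atc cac ATG TCC TCG TCT TCG GCG",
}


def translate(dna):
    """Translate a DNA string (spaces and case ignored) in frame 1."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


for construct, sequence in JUNCTIONS.items():
    print(construct, translate(sequence))
```

In both junctions the two linker codons translate as isoleucine and histidine, and the ZmRAD51 initiator methionine remains in frame with GFPm.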

To create versions of these constructs utilizing the maize ZmDJ1 promoter, the 
1.8 kb BamHI/HpaI fragments containing GFPm/ZmRAD51A and GFPm/ZmRAD51B 
coding sequences from PHP8744 and PHP8745 were inserted downstream of the 0.8 kb 
SacI/BglII ZmDJ1 promoter sequence and upstream of the PinII terminator in a 
pBlueScript™ plasmid backbone to generate PHP8961 and PHP8962. 
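
The six plant constructs described in this example share one layout (promoter, coding region, terminator, backbone), which can be summarized as a small data structure. The following Python sketch is merely a bookkeeping aid under the assumption that the element names given in the text are complete; the class name `Cassette` and its field names are hypothetical.

```python
# Sketch only: a tabulation of the constructs named in Example 5.
from dataclasses import dataclass


@dataclass(frozen=True)
class Cassette:
    name: str         # plasmid designation from the text
    promoter: str     # promoter driving expression
    coding: tuple     # coding elements, 5' to 3'
    terminator: str
    backbone: str

    def describe(self):
        return (f"{self.name}: {self.promoter}::{'-'.join(self.coding)}"
                f"::{self.terminator} in {self.backbone}")


CONSTRUCTS = [
    Cassette("PHP8060", "ubiquitin", ("ZmRAD51A",), "PinII", "pUC19"),
    Cassette("PHP8103", "ubiquitin", ("ZmRAD51B",), "PinII", "pUC19"),
    Cassette("PHP8744", "ubiquitin", ("GFPm", "ZmRAD51A"), "PinII", "pUC19"),
    Cassette("PHP8745", "ubiquitin", ("GFPm", "ZmRAD51B"), "PinII", "pUC19"),
    Cassette("PHP8961", "ZmDJ1", ("GFPm", "ZmRAD51A"), "PinII", "pBlueScript"),
    Cassette("PHP8962", "ZmDJ1", ("GFPm", "ZmRAD51B"), "PinII", "pBlueScript"),
]

# Group the GFPm fusions by promoter.
fusions_by_promoter = {}
for cassette in CONSTRUCTS:
    if "GFPm" in cassette.coding:
        fusions_by_promoter.setdefault(cassette.promoter, []).append(cassette.name)

for cassette in CONSTRUCTS:
    print(cassette.describe())
```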


Example 6 - Introduction of ZmRAD51 Gene Constructs into Maize Cells and 
Detection of Gene Expression 

The various constructs described in Example 5 were introduced into cells of the 
Black Mexican Sweet ("BMS") maize line (Sheridan, Cell Biol., 67:396a (1975)) by 
particle gun bombardment using 1 µg of plasmid per particle preparation at 6 shots per 
preparation. About 100 mg of BMS cells per plate were shot. For experiments utilizing 
PHP8060 or PHP8103, another construct, PHP9053, which carried a fusion between a 
nuclear localization sequence (NLS), the GFPm gene as above and a maize acetolactate 
synthase (ALS) gene (Fang et al., Plant Mol. Biol., 18:1185-1187 (1992)), all driven 
from the ubiquitin promoter, was shot concurrently. To create PHP9053, the nuclear 
localization signal from simian virus 40 (SV40) (Kalderon et al., Cell, 39:499-509 
(1984)) was synthesized as a BamHI/NcoI fragment and inserted at BamHI and AflIII 
sites between the ubiquitin promoter and the start codon of GFPm. In order to enhance 
retention of the protein in the nucleus, the molecular weight of NLS/GFPm and hence 
the size of the protein was increased by making a carboxy terminal fusion with a large 
unrelated protein, in this case the maize ALS gene. The ALS coding sequence was 
inserted in frame at the GFPm 3' BglII site and blunt-end ligated to the PinII terminator. 
Cells were viewed at 24-48 hours post-bombardment for GFP expression using a 
microscope equipped with epi-fluorescence and a FITC filter set. 

In all cases, GFP expression was noted in the nucleus. At no time was GFP 
fluorescence noted in the cytoplasm, either when GFP was part of a fusion that included 
the NLS, or as a fusion only with ZmRAD51A or ZmRAD51B. The data obtained with 
PHP8744, PHP8745, PHP8961 and PHP8962 indicate that the expressed RAD51A or 
RAD51B proteins (in this case as fusions with GFP) localize to the nucleus in the 
absence of an exogenously added sequence known to facilitate nuclear localization (i.e., 
the SV40 NLS sequence). Comparable localization results were obtained using two 
independent promoters (the maize ubiquitin or the ZmDJ1 promoter), indicating that the 
information for nuclear localization is located within the ZmRAD51 coding sequences. 
With constructs containing GFP alone, expression does not localize to the nucleus. The 
ZmRAD51 nuclear localization sequence is located within the 5' end of the coding 
region, preferably the first 40 amino acids of SEQ ID NO:3 or 7, more 
preferably the first 30 amino acids of SEQ ID NO:3 or 7, even more preferably the first 
20 amino acids of SEQ ID NO:3 or 7, or most preferably the first 10 amino acids. The 
corresponding polynucleotide sequence would be from nucleotide 53 to 113 of SEQ ID 
NO:1 or nucleotide 73 to 132 of SEQ ID NO:5 and fragments thereof. 

As such, the methods and constructs disclosed provide a means of introducing 
maize RAD51 genes, or fusions of other genes with the maize RAD51 genes, into maize 
cells and maize nuclei, stably expressing the gene products under constitutive or 
inducible control and studying the role of these genes in plant cells. 

Example 7 - Expression of ZmRAD51 Genes in an E. coli Host Cell

E. coli expression vectors PHP9011 and PHP9012 were constructed by insertion
of BamHI/HpaI fragments containing the ZmRAD51A and ZmRAD51B genes from
PHP8057 and PHP8058, respectively, into the BamHI and HincII sites in a pET32c
plasmid (Novagen, Inc., Madison, WI). The resulting plasmids consisted of the T7
promoter driving expression of a protein containing a 108 amino acid thioredoxin tag, a
6 amino acid histidine tag, a thrombin cleavage site, a 15 amino acid S tag (Novagen
Inc., Madison, WI), and an enterokinase cleavage site fused to the amino terminus of
one of the full length ZmRAD51 coding sequences.
The two constructs PHP9011 and PHP9012 were each transformed into the
Novagen pET E. coli host strain AD494(DE3)pLysS, and used subsequently to express
and purify ZmRAD51A and ZmRAD51B protein, respectively, according to the
following procedure.

E. coli cells transformed with either the PHP9011 or the PHP9012 expression
vector were incubated in 2YT medium (Life Technologies, Gaithersburg, MD) containing
100 µg/ml carbenicillin and 34 µg/ml chloramphenicol. Cells were induced at
approximately OD600 = 0.8 with 0.2 mM IPTG (isopropyl-1-thio-β-D-galactoside) and
incubated at room temperature for 3 hours.

The cells were then lysed in a lysis buffer containing 50 mM Tris-HCl (pH 8.0),
500 mM NaCl, 5 mM 2-mercaptoethanol, 1 mM PMSF (phenylmethylsulfonylfluoride),
0.1% Triton X-100, 10% glycerol and 5 mM imidazole.
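
As a rough aid for preparing such a buffer, the dilution arithmetic (C1·V1 = C2·V2) can be sketched as below. The final concentrations are those of the lysis buffer above; every stock concentration, the 100 ml batch size, and the helper names are assumptions chosen only for illustration and should be replaced with the stocks actually on hand.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Solve C1*V1 = C2*V2 for V1; both concentrations share one unit."""
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * final_volume_ml / stock_conc


# name: (final concentration from the recipe, ASSUMED stock concentration)
LYSIS_BUFFER = {
    "Tris-HCl pH 8.0 (mM)": (50, 1000),
    "NaCl (mM)": (500, 5000),
    "2-mercaptoethanol (mM)": (5, 14300),  # neat reagent, roughly 14.3 M
    "PMSF (mM)": (1, 100),
    "Triton X-100 (%)": (0.1, 10),
    "glycerol (%)": (10, 50),
    "imidazole (mM)": (5, 1000),
}


def recipe(final_volume_ml=100.0):
    """Return the volume (ml) of each assumed stock plus make-up water."""
    volumes = {
        name: stock_volume_ml(final, stock, final_volume_ml)
        for name, (final, stock) in LYSIS_BUFFER.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, ml in recipe(100.0).items():
        print(f"{name:26s} {ml:8.3f} ml")
```

The same function can be reused for the equilibrium, wash, and elution buffers by swapping in their component lists.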

Purification of the cell lysate was carried out at 4°C on a TALON metal affinity
column equilibrated with a solution containing 50 mM Tris-HCl (pH 8.0), 500 mM NaCl,
0.1% Triton X-100, 10% glycerol and 5 mM imidazole ("equilibrium buffer"). Lysate
was loaded onto the equilibrated TALON column. The loaded column was washed with
the equilibrium buffer and then washed with a solution of 50 mM Tris-HCl (pH 8.0),
10% glycerol and 5 mM imidazole. The washed column was then eluted with a solution
containing 50 mM Tris-HCl (pH 8.0), 10% glycerol and 100 mM imidazole. DTT (1 mM)
and EDTA (1 mM) were added to the eluted protein, which was then stored at 4°C.
The expressed fusion proteins were then processed and purified as follows. They
were first dialyzed to remove imidazole, and the dialyzed fusion proteins
were site-specifically cleaved with enterokinase to remove the thioredoxin tag. The
cleavage products were purified on the TALON column to remove the cleaved tag
fragment. Yields of protein were 3 mg/L for ZmRAD51A and 1.5 mg/L for ZmRAD51B.

All publications and patent applications in this specification are indicative of the
level of ordinary skill in the art to which this invention pertains. All publications and
patent applications are herein incorporated by reference to the same extent as if each
individual publication or patent application was specifically and individually indicated by
reference.

The invention has been described with reference to various specific and preferred
embodiments and techniques. However, it should be understood that many variations
and modifications may be made while remaining within the spirit and scope of the
invention.
TABLE I

FULL LENGTH cDNA AND
CORRESPONDING AMINO ACID SEQUENCE FOR ZmRAD51A

LOCUS     ZmRAD51A (Sequence corresponding to cDNA insert in PHP7981)
FEATURES  peptide from 53 to 1072
ORIGIN    Zea mays A632 line

   1 GGCACGAGTTCGAACAGGGGCAGAGGTGAGACTTGAGAGAAGGAAGAAGGTCATGTCGTC
     MSS
  61 GGCGGCGCAGCAGCAGCAGAAAGCGGCGGCAGCGGAGCAGGAGGAGGTGGAGCACGGGCC
     AAQQQQKAAAAEQEEVEHGP
 121 ATTCCCCATCGAGCAGCTCCAGGCTTCTGGAATAGCTGCATTGGATGTGAAGAAGCTGAA
     FPIEQLQASGIAALDVKKLK
 181 AGATTCTGGTCTCCACACTGTGGAGGCTGTGGCTTACACTCCAAGGAAAGATCTTCTGCA
     DSGLHTVEAVAYTPRKDLLQ
 241 GATCAAAGGGATAAGTGAAGCTAAAGCTGACAAGATAATTGAAGCAGCATCCAAGATAGT
     IKGISEAKADKIIEAASKIV
 301 TCCACTGGGATTTACAAGTGCCAGTCAACTTCATGCGCAGCGACTGGAGATTATTCAAGT
     PLGFTSASQLHAQRLEIIQV
 361 TACAACTGGATCAAGAGAGCTTGATAAGATATTGGAGGGTGGGATAGAAACAGGATCTAT
     TTGSRELDKILEGGIETGSI
 421 CACTGAGATATATGGTGAGTTCCGCTCTGGAAAGACTCAGTTGTGTCACACCCCTTGTGT
     TEIYGEFRSGKTQLCHTPCV
 481 TACATGTCAGCTTCCACTGGACCAGGGTGGTGGTGAAGGAAAGGCTCTATATATTGACGC
     TCQLPLDQGGGEGKALYIDA
 541 AGAGGGTACATTCAGACCACAAAGGCTCTTGCAGATTGCTGACAGGTTTGGACTGAATGG
     EGTFRPQRLLQIADRFGLNG
 601 TGCTGATGTGTTAGAGAATGTGGCTTATGCCAGAGCTTATAATACGGATCATCAATCTAG
     ADVLENVAYARAYNTDHQSR
 661 ACTTCTGCTGGAAGCAGCTTCCATGATGATAGAGACCAGGTTTGCTCTTATGGTTGTAGA
     LLLEAASMMIETRFALMVVD
 721 CAGTGCCACAGCTCTGTACAGAACTGATTTCTCAGGAAGAGGGGAACTATCAGCGAGGCA
     SATALYRTDFSGRGELSARQ
 781 AATGCACATGGCTAAGTTCCTGAGGAGCCTTCAGAAGTTAGCTGATGAGTTTGGAGTAGC
     MHMAKFLRSLQKLADEFGVA
 841 TGTGGTTATCACCAATCAAGTAGTGGCCCAAGTGGATGGATCTGCTATGTTTGCTGGACC
     VVITNQVVAQVDGSAMFAGP
 901 GCAGTTCAAGCCCATTGGTGGAAACATCATGGCTCATGCTTCAACCACAAGGCTTGCTCT
     QFKPIGGNIMAHASTTRLAL
 961 TCGCAAGGGGCGAGGGGAGGAACGGATCTGTAAAGTAATAAGCTCTCCCTGCCTTGCTGA
     RKGRGEERICKVISSPCLAE
1021 AGCCGAAGCAAGGTTTCAGTTAGCTTCTGAAGGTATTGCAGATGTTAAGGATTGAGACCA
     AEARFQLASEGIADVKD
1081 TACCTGCTTTACAGGCATCTTCAGATCCATTGGTCTGCTATTTGCTTTGTCATTCCTTGG
     GTTAAC (HpaI)
1141 GCCAACTTTCGTGTTGCCTCACCTTGATGTACAAAACGGTTTCGTTCACATATGTGAATG
1201 CACGCCTGTGACTGATTTAGGCGTCCTGTTGTAAATAAAACGATGCCTGTTGCCCTGTTG
1261 TGTGTTGCATGTAATCGACAACTCTACATATCACAATTATGATGTATTTTAGGTTTTATT
1321 GTTCGCTTAGCACAGCCATTGCTGGATGTGCAATGTGGGATTATAGACAAGAATCCACAC
1381 AACAACAATGGCCAATCCTGATAAAGTAGTTAGTGACTTGGGCAAATAGCATTGTGGTGA
1441 TCTTTGAGTTCACTTGTGATAAGAACAGGGCTGGTGGCTGGTGGTGAAAACTAACTTGTG
1501 ATCGGAACAGGTTTAATAGGGAAAACTAAGGATTCTATAAAAAAAAAAATAAAAAAAAAA
1561 AAAAAAAA
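
The correspondence between the cDNA and the peptide shown in the table can be checked by translating from the annotated CDS start. The following is a minimal sketch, assuming the standard genetic code and 1-based coordinates; the function name `translate` and the 120-nucleotide excerpt (the start of SEQ ID NO:1) are chosen here for illustration only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cdna: str, cds_start: int = 1) -> str:
    """Translate from 1-based position cds_start up to the first stop
    codon (or the end of the sequence). Unknown codons become 'X'."""
    seq = cdna.upper().replace(" ", "")[cds_start - 1:]
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 120 nucleotides of the ZmRAD51A cDNA; the CDS begins at nt 53.
HEAD = (
    "GGCACGAGTTCGAACAGGGGCAGAGGTGAGACTTGAGAGAAGGAAGAAGGTCATGTCGTC"
    "GGCGGCGCAGCAGCAGCAGAAAGCGGCGGCAGCGGAGCAGGAGGAGGTGGAGCACGGGCC"
)

if __name__ == "__main__":
    print(translate(HEAD, cds_start=53))
```

Passing the full 1568-nucleotide sequence with `cds_start=53` would be expected to stop at the TGA following nucleotide 1072 and return a 340-residue peptide.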
TABLE II

FULL LENGTH cDNA AND
CORRESPONDING AMINO ACID SEQUENCE FOR ZmRAD51B

LOCUS     ZmRAD51B (Sequence corresponding to cDNA insert in PHP7983)
FEATURES  peptide from 73 to 1092
ORIGIN    Zea mays A632 line

   1 GAATTCGGCACGAGATTTTTTGCCGCTTCGGAGGCACCTTCGAACAAAGCCCAAAAGCAG
  61 CCAGCGCACCGCATGTCCTCGTCTTCGGCGCACCAGAAGGCGTCGCCGCCGATAGAGGAG
     MSSSSAHQKASPPIEE
     GGATCC (BamHI)
 121 GAAGCGACGGAGCACGGACCCTTCCCCATCGAACAGCTACAGGCATCTGGAATAGCTGCA
     EATEHGPFPIEQLQASGIAA
 181 CTTGATGTGAAAAAACTCAAAGATGCTGGTCTCTGCACAGTGGAATCTGTAGCATACTCT
     LDVKKLKDAGLCTVESVAYS
 241 CCAAGGAAAGACCTTTTGCAAATTAAAGGGATTAGTGAAGCCAAAGTCGACAAGATAATT
     PRKDLLQIKGISEAKVDKII
 301 GAAGCAGCTTCCAAGTTGGTTCCACTCGGATTTACTAGTGCTAGCCAACTTCATGCACAG
     EAASKLVPLGFTSASQLHAQ
 361 AGACTTGAGATCATCCAGCTTACAACTGGATCTAGAGAGCTTGATCAAATTTTGGACGGT
     RLEIIQLTTGSRELDQILDG
 421 GGAATAGAAACAGGATCTATCACAGAGATGTATGGTGAATTTCGCTCCGGGAAGACTCAG
     GIETGSITEMYGEFRSGKTQ
 481 TTGTGCCACACTCTCTGTGTCACATGTCAGCTCCCATTGGACCAAGGTGGTGGTGAAGGA
     LCHTLCVTCQLPLDQGGGEG
 541 AAGGCTTTGTATATTGATGCAGAGGGTACATTCAGGCCTCAAAGAATTCTCCAGATAGCA
     KALYIDAEGTFRPQRILQIA
 601 GACAGGTTTGGCTTGAATGGCGCTGATGTACTAGAGAATGTGGCTTATGCCAGAGCATAT
     DRFGLNGADVLENVAYARAY
 661 AACACTGATCATCAATCAAGACTTTTGCTAGAAGCAGCCTCCATGATGGTAGAGACCAGG
     NTDHQSRLLLEAASMMVETR
 721 TTTGCTCTCATGGTTGTGGATAGTGCTACAGCCCTTTACAGAACTGATTTCTCTGGTAGA
     FALMVVDSATALYRTDFSGR
 781 GGGGAGCTATCAGCAAGGCAGATGCATCTGGCGAAGTTTCTTAGGAGCCTTCAAAAGTTA
     GELSARQMHLAKFLRSLQKL
 841 GCAGATGAGTTTGGAGTGGCAGTGGTAATCACGAACCAAGTAGTGGCTCAAGTGGATGGT
     ADEFGVAVVITNQVVAQVDG
 901 GCTGCAATGTTTGCTGGGCCACAGATCAAGCCCATTGGAGGGAACATCATGGCTCATGCT
     AAMFAGPQIKPIGGNIMAHA
 961 TCCACAACTAGGCTCTTTCTTCGCAAGGGAAGAGGGGAGGAGCGGATCTGCAAAGTAATC
     STTRLFLRKGRGEERICKVI
1021 AGCTCTCCCTGCCTGGCTGAAGCTGAAGCAAGGTTTCAGATATCATCTGAGGGTGTCACT
     SSPCLAEAEARFQISSEGVT
1081 GATGTCAAGGACTGAAAGCATCCTCATTTGCAGTCCACAGCATAACTTGCCAATTCAGAC
     DVKD
     GTTAAC (HpaI)
1141 GAATCTCTGATCTGCTGCACTCGTGTCGGTCCCTTGTACAATCAAAATACCAGTACAGGC
1201 TTCCAGAATGCGAATGCAAATCCGTTGGAGTGTGGCACTGTCATCCTGTTGTCTTTAGGT
1261 ACCATCTAAAGTTGGCATTGTTGTAAAGTGGTAGAGCGCAAGGCTCTACTTTGTAGCCGT
1321 GGATTCGAGCCCTATGGTGGGCGTTATTTAATTTTTTTGGCGAAAAAGCCTTTAATTGAG
1381 TTGTTTAGGTGATATGAATAACTCTTTAGGTCATGGAGTTCGACTCCATGGGAGTTTAAG
1441 CTGGGTTAAAAAAAATTATGGTCACGATCTTTTTCACATGGGCTACTGTAACATCTCGTC
1501 TACTCCTGAACCGATGTTAAGCTTTTTAGGACTATAGATCATCTTCATATATCAACAAAA
1561 AAAAAAAAAAAAAA
TABLE III

RAD51A SEQUENCE-SPECIFIC PROBE FRAGMENT

5'-CCA
TACCTGCTTT ACAGGCATCT TCAGATCCAT TGGTCTGCTA TTTGCTTTGT CATTCCTTGG
GCCAACTTTC GTGTTGCCTC ACCTTGATGT ACAAAACGGT TTCGTTCACA TATGTGAATG
CACGCCTGTG ACTGATTTAG GCGTCCTGTT GTAAATAAAA CGATGCCTGT TGCCCTGTTG
TGTGTTGCAT GTAATCGACA ACTCTACATA TCACAATTAT GATGTATTTT AGGTTTTATT
GTTCGCTTAG CACAGCCATT GCTGGATGTG CAATGTGGGA TTATAGACAA GAATCCACAC
AACAACAATG GCCAATCCTG ATAAAGTAGT TAGTGACTTG GGCAAATAGC ATTGTGGTGA
TCTTTGAGTT CACTTGTGAT AAGAACAGGG CTGGTGGCTG GTGGTGAAAA CTAACTTGTG
ATCGGAACAG GTTTAATAGG GAAAACTAAG GATTCTATAA AAAAAAAAAT AAAAAAAAAA
AAAAAAAACT CGAGGGGGGG CCCGGTACCC AATTCGCCCT ATAGTGAGTG AGTCGTATTA
CAATTCACTG GCCGTCGTTT TACAACGTCG TGACTGGGA-3'



TABLE IV

RAD51B SEQUENCE-SPECIFIC PROBE FRAGMENT

5'-CA TCCTCATTTG CAGTCCACAG CATAACTTGC CAATTCAGAC
GAATCTCTGA TCTGCTGCAC TCGTGTCGGT CCCTTGTACA ATCAAAATAC CAGTACAGGC
TTCCAGAATG CGAATGCAAA TCCGTTGGAG TGTGGCACTG TCATCCTGTT GTCTTTAGGT
ACCATCTAAA GTTGGCATTG TTGTAAAGTG GTAGAGCGCA AGGCTCTACT TTGTAGCCGT
GGATTCGAGC CCTATGGTGG GCGTTATTTA ATTTTTTTGG CGAAAAAGCC TTTAATTGAG
TTGTTTAGGT GATATGAATA ACTCTTTAGG TCATGGAGTT CGACTCCATG GGAGTTTAAG
CTGGGTTAAA AAAAATTATG GTCACGATCT TTTTCACATG GGCTACTGTA ACATCTCGTC
TACTCCTGAA CCGATGTTAA GCTTTTTAGG ACTATAGATC ATCTTCATAT ATCAACAAAA
AAAAAAAAAA AAAACTCGAG GGGGGGCCCG GTACCCAATT CGCCCTATAG TGAGTGAGTC
GTATTACAAT TCACTGGCCG TCGTTTTACA ACGTCGTGAC TGGGA-3'
TABLE V

Polypeptide Sequence Similarities
Between ZmRAD51A and ZmRAD51B and
RAD51 Homologs From Other Higher Eukaryotes

SOURCE OF                 ZmRAD51A                     ZmRAD51B
RAD51 SEQUENCE    % similarity   % identity    % similarity   % identity

ZmRAD51A             100.00        100.00          94.12         90.00
ZmRAD51B              94.12         90.00         100.00        100.00
Tomato                92.65         86.76          94.12         89.12
Human                 83.00         70.00          81.12         69.03
Mouse                 82.79         69.40          81.12         69.03
Chicken               81.60         68.84          80.53         68.73
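
For orientation, a percent-identity figure of the kind tabulated above can be computed from two aligned sequences as sketched below. This is a simplified, ungapped position-by-position comparison; the alignment program and scoring scheme behind Table V are not stated here, so the function `percent_identity` and its use on the first 30 residues of SEQ ID NO: 3 and SEQ ID NO: 7 are illustrative assumptions only.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues in two pre-aligned,
    equal-length sequences (no gap handling)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# First 30 residues of ZmRAD51A (SEQ ID NO: 3) and ZmRAD51B (SEQ ID NO: 7).
A_NTERM = "MSSAAQQQQKAAAAEQEEVEHGPFPIEQLQ"
B_NTERM = "MSSSSAHQKASPPIEEEATEHGPFPIEQLQ"

if __name__ == "__main__":
    print(f"{percent_identity(A_NTERM, B_NTERM):.2f}% identity over 30 residues")
```

Because the two proteins diverge mainly near the amino terminus, this 30-residue window scores well below the full-length value reported in the table; a gapped alignment over all 340 residues would be needed to approach the tabulated figures.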
TABLE VI

MARKER    DISTANCE

P1057A      23.8
P1158A      13.8
P9112A      12.2
P1054A      29.2
P1157A       1.5
P1147A       7.7
P5533A      15.3
P9263A      13.1
P9240A       6.2
P8057A       2.1
P1173A       3.1
P1033A      20.9
P1036A       3.4
P5564A       3.6
P554A       22.9
P3871A       4.5
P1059A      23.1
P1037A      12.7
P1129A      10.1



ALIAS 


BIN 


hnl 1 S 40 


7 01/7 m 


umc98 


7 CP 




7 09 


bnl 15 21 


7 03 


nmf 110 


7 0^ 


UlilL- JU 










7 0,4 


npi240 


7 04 


rad51 A 




limp 1 9ST3 


7 O/l 


hnlR 39 


7 Ozl 


hnift 3Q 


7 OS 


jc943 




jc878 




umc151


7.05


bnl16.06


7.05 


bnl8.44 


7.06 


umc35 


7.06 
MARKER

P1035A
P1152A
P8058A
P1044A
P1123A
P1102A
P1017A
P1185A
P9257A
P1140A
P9457A



DISTANCE 

25.3 

0.0 

5.4 
10.6 

7.1 

3.2 
10.7 

0.0 
16.3 
19.2 



ALIAS 

bnl8.35
umc10
rad51B
bnl10.24
umc60
umc82
bnl6.16
umc17
npi257
umc63
npi457



BIN 

3.03 



3.06 
3.06 

3.07 
3.07 
3.07 
3.08 
3.09 
What is claimed is:

1. An isolated polynucleotide comprising a member selected from the group consisting
of:

a) a polynucleotide encoding a polypeptide selected from the group consisting
of SEQ ID NO: 3 and SEQ ID NO: 7;

b) a polynucleotide having at least 90% identity to a polynucleotide of (a);

c) a polynucleotide which is complementary under conditions of high
stringency to said polynucleotide of (a) or (b); and

d) a polynucleotide comprising at least 15 contiguous nucleotides from a
polynucleotide of (a), (b) or (c).

2. The isolated polynucleotide of claim 1, wherein said polynucleotide has a sequence
selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 6.

3. An expression cassette comprising a polynucleotide of claim 1 operably linked to a
promoter.

4. A host cell transfected with an expression cassette of claim 3.

5. The host cell of claim 4, wherein said host cell is a bacterial cell.

6. The host cell of claim 4, wherein said host cell is a sorghum or maize cell.

7. An isolated protein comprising a polypeptide of at least 10 contiguous amino acids
encoded by a polynucleotide of claim 2.

8. The protein of claim 7, wherein said polypeptide has a sequence of SEQ ID NO: 3
or SEQ ID NO: 7.

9. An isolated polynucleotide comprising a polynucleotide amplified from a Zea mays
nucleic acid library using the primers selected from the group consisting of: SEQ ID
NOS: 12 and 13, SEQ ID NOS: 14 and 19, SEQ ID NOS: 14 and 20, and SEQ ID
NOS: 14 and 15, or complements thereof.

10. The isolated nucleic acid of claim 9, wherein said nucleic acid library is a cDNA
library.

11. A transgenic plant comprising an isolated polynucleotide of claim 1.

12. A transgenic seed from the transgenic plant of claim 11.

13. A transgenic plant cell comprising an isolated polynucleotide of claim 1.

14. An RFLP probe for a maize recombinase gene comprising at least 15 nucleotide
residues of SEQ ID NO: 4, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, or
SEQ ID NO: 11.
15. A nuclear localization sequence comprising the first 10 to 40 amino acids of SEQ ID
NO: 3 or SEQ ID NO: 7.

16. The nuclear localization sequence of claim 15 wherein the nuclear localization
sequence comprises nucleotide 53 to 113 of SEQ ID NO: 1 or nucleotide 73 to 132
of SEQ ID NO: 5.

17. A method of making maize recombinase comprising the steps of:

a) transforming or transfecting a host cell with the vector of claim 3; and

b) purifying the recombinase from the host cell.

18. The method of claim 17, wherein the host cell is selected from the group consisting
of a bacterial cell, a plant cell, a mammalian cell and a yeast cell.

19. A method of modulating ZmRAD51 activity in a plant, comprising:

(a) introducing into a plant cell an expression cassette comprising a ZmRAD51
polynucleotide of claim 1 operably linked to a promoter;

(b) culturing the plant cell under plant cell growing conditions; and

(c) inducing expression of said polynucleotide for a time sufficient to modulate
ZmRAD51 activity in said plant.
SEQUENCE LISTING

<110> Pioneer Hi-Bred International, Inc.

<120> Nucleotide Sequences Encoding Maize RAD51

<130> 0556

<150> US 60/074743
<151> 1998-02-13

<160> 20

<170> FastSEQ for Windows Version 3.0

<210> 1
<211> 1568
<212> DNA
<213> Zea mays

<220>
<221> CDS
<222> (53)...(1072)

<400> 1

ggcacgagtt cgaacagggg cagaggtgag acttgagaga aggaagaagg tc atg tcg  58
                                                          Met Ser
                                                          1

tcg gcg gcg cag cag cag cag aaa gcg gcg gca gcg gag cag gag gag  106
Ser Ala Ala Gln Gln Gln Gln Lys Ala Ala Ala Ala Glu Gln Glu Glu
        5                   10                  15

gtg gag cac ggg cca ttc ccc atc gag cag ctc cag gct tct gga ata  154
Val Glu His Gly Pro Phe Pro Ile Glu Gln Leu Gln Ala Ser Gly Ile
    20                  25                  30

gct gca ttg gat gtg aag aag ctg aaa gat tct ggt ctc cac act gtg  202
Ala Ala Leu Asp Val Lys Lys Leu Lys Asp Ser Gly Leu His Thr Val
35                  40                  45                  50

gag gct gtg gct tac act cca agg aaa gat ctt ctg cag atc aaa ggg  250
Glu Ala Val Ala Tyr Thr Pro Arg Lys Asp Leu Leu Gln Ile Lys Gly
                55                  60                  65

ata agt gaa gct aaa gct gac aag ata att gaa gca gca tcc aag ata  298
Ile Ser Glu Ala Lys Ala Asp Lys Ile Ile Glu Ala Ala Ser Lys Ile
            70                  75                  80

gtt cca ctg gga ttt aca agt gcc agt caa ctt cat gcg cag cga ctg  346
Val Pro Leu Gly Phe Thr Ser Ala Ser Gln Leu His Ala Gln Arg Leu
        85                  90                  95

gag att att caa gtt aca act gga tca aga gag ctt gat aag ata ttg  394
Glu Ile Ile Gln Val Thr Thr Gly Ser Arg Glu Leu Asp Lys Ile Leu
    100                 105                 110

gag ggt ggg ata gaa aca gga tct atc act gag ata tat ggt gag ttc  442
Glu Gly Gly Ile Glu Thr Gly Ser Ile Thr Glu Ile Tyr Gly Glu Phe
115                 120                 125                 130

cgc tct gga aag act cag ttg tgt cac acc cct tgt gtt aca tgt cag  490
Arg Ser Gly Lys Thr Gln Leu Cys His Thr Pro Cys Val Thr Cys Gln
                135                 140                 145

ctt cca ctg gac cag ggt ggt ggt gaa gga aag gct cta tat att gac  538
Leu Pro Leu Asp Gln Gly Gly Gly Glu Gly Lys Ala Leu Tyr Ile Asp
            150                 155                 160

gca gag ggt aca ttc aga cca caa agg ctc ttg cag att gct gac agg  586
Ala Glu Gly Thr Phe Arg Pro Gln Arg Leu Leu Gln Ile Ala Asp Arg
        165                 170                 175

ttt gga ctg aat ggt gct gat gtg tta gag aat gtg gct tat gcc aga  634
Phe Gly Leu Asn Gly Ala Asp Val Leu Glu Asn Val Ala Tyr Ala Arg
    180                 185                 190

gct tat aat acg gat cat caa tct aga ctt ctg ctg gaa gca gct tcc  682
Ala Tyr Asn Thr Asp His Gln Ser Arg Leu Leu Leu Glu Ala Ala Ser
195                 200                 205                 210

atg atg ata gag acc agg ttt gct ctt atg gtt gta gac agt gcc aca  730
Met Met Ile Glu Thr Arg Phe Ala Leu Met Val Val Asp Ser Ala Thr
                215                 220                 225

gct ctg tac aga act gat ttc tca gga aga ggg gaa cta tca gcg agg  778
Ala Leu Tyr Arg Thr Asp Phe Ser Gly Arg Gly Glu Leu Ser Ala Arg
            230                 235                 240

caa atg cac atg gct aag ttc ctg agg agc ctt cag aag tta gct gat  826
Gln Met His Met Ala Lys Phe Leu Arg Ser Leu Gln Lys Leu Ala Asp
        245                 250                 255

gag ttt gga gta gct gtg gtt atc acc aat caa gta gtg gcc caa gtg  874
Glu Phe Gly Val Ala Val Val Ile Thr Asn Gln Val Val Ala Gln Val
    260                 265                 270

gat gga tct gct atg ttt gct gga ccg cag ttc aag ccc att ggt gga  922
Asp Gly Ser Ala Met Phe Ala Gly Pro Gln Phe Lys Pro Ile Gly Gly
275                 280                 285                 290

aac atc atg gct cat gct tca acc aca agg ctt gct ctt cgc aag ggg  970
Asn Ile Met Ala His Ala Ser Thr Thr Arg Leu Ala Leu Arg Lys Gly
                295                 300                 305

cga ggg gag gaa cgg atc tgt aaa gta ata agc tct ccc tgc ctt gct  1018
Arg Gly Glu Glu Arg Ile Cys Lys Val Ile Ser Ser Pro Cys Leu Ala
            310                 315                 320

gaa gcc gaa gca agg ttt cag tta gct tct gaa ggt att gca gat gtt  1066
Glu Ala Glu Ala Arg Phe Gln Leu Ala Ser Glu Gly Ile Ala Asp Val
        325                 330                 335

aag gat tgagaccata cctgctttac aggcatcttc agatccattg gtctgctatt  1122
Lys Asp
    340

tgctttgtca ttccttgggc caactttcgt gttgcctcac cttgatgtac aaaacggttt  1182
cgttcacata tgtgaatgca cgcctgtgac tgatttaggc gtcctgttgt aaataaaacg  1242
atgcctgttg ccctgttgtg tgttgcatgt aatcgacaac tctacatatc acaattatga  1302
tgtattttag gttttattgt tcgcttagca cagccattgc tggatgtgca atgtgggatt  1362
atagacaaga atccacacaa caacaatggc caatcctgat aaagtagtta gtgacttggg  1422
caaatagcat tgtggtgatc tttgagttca cttgtgataa gaacagggct ggtggctggt  1482
ggtgaaaact aacttgtgat cggaacaggt ttaataggga aaactaagga ttctataaaa  1542
aaaaaaataa aaaaaaaaaa aaaaaa  1568



<210> 2 

<211> 1020 

<212> DNA 

<213> Zea mays 



<400> 2 

atgtcgtcgg cggcgcagca gcagcagaaa gcggcggcag cggagcagga ggaggtggag  60
cacgggccat tccccatcga gcagctccag gcttctggaa tagctgcatt ggatgtgaag  120
aagctgaaag attctggtct ccacactgtg gaggctgtgg cttacactcc aaggaaagat  180
cttctgcaga tcaaagggat aagtgaagct aaagctgaca agataattga agcagcatcc  240
aagatagttc cactgggatt tacaagtgcc agtcaacttc atgcgcagcg actggagatt  300
attcaagtta caactggatc aagagagctt gataagatat tggagggtgg gatagaaaca  360
ggatctatca ctgagatata tggtgagttc cgctctggaa agactcagtt gtgtcacacc  420
ccttgtgtta catgtcagct tccactggac cagggtggtg gtgaaggaaa ggctctatat  480
attgacgcag agggtacatt cagaccacaa aggctcttgc agattgctga caggtttgga  540
ctgaatggtg ctgatgtgtt agagaatgtg gcttatgcca gagcttataa tacggatcat  600
caatctagac ttctgctgga agcagcttcc atgatgatag agaccaggtt tgctcttatg  660
gttgtagaca gtgccacagc tctgtacaga actgatttct caggaagagg ggaactatca  720
gcgaggcaaa tgcacatggc taagttcctg aggagccttc agaagttagc tgatgagttt  780
ggagtagctg tggttatcac caatcaagta gtggcccaag tggatggatc tgctatgttt  840
gctggaccgc agttcaagcc cattggtgga aacatcatgg ctcatgcttc aaccacaagg  900
cttgctcttc gcaaggggcg aggggaggaa cggatctgta aagtaataag ctctccctgc  960
cttgctgaag ccgaagcaag gtttcagtta gcttctgaag gtattgcaga tgttaaggat  1020



<210> 3 

<211> 340 

<212> PRT 

<213> Zea mays 



<400> 3 



Met Ser Ser Ala Ala Gln Gln Gln Gln Lys Ala Ala Ala Ala Glu Gln
1               5                   10                  15
Glu Glu Val Glu His Gly Pro Phe Pro Ile Glu Gln Leu Gln Ala Ser
            20                  25                  30
Gly Ile Ala Ala Leu Asp Val Lys Lys Leu Lys Asp Ser Gly Leu His
        35                  40                  45
Thr Val Glu Ala Val Ala Tyr Thr Pro Arg Lys Asp Leu Leu Gln Ile
    50                  55                  60
Lys Gly Ile Ser Glu Ala Lys Ala Asp Lys Ile Ile Glu Ala Ala Ser
65                  70                  75                  80
Lys Ile Val Pro Leu Gly Phe Thr Ser Ala Ser Gln Leu His Ala Gln
                85                  90                  95
Arg Leu Glu Ile Ile Gln Val Thr Thr Gly Ser Arg Glu Leu Asp Lys
            100                 105                 110
Ile Leu Glu Gly Gly Ile Glu Thr Gly Ser Ile Thr Glu Ile Tyr Gly
        115                 120                 125
Glu Phe Arg Ser Gly Lys Thr Gln Leu Cys His Thr Pro Cys Val Thr
    130                 135                 140
Cys Gln Leu Pro Leu Asp Gln Gly Gly Gly Glu Gly Lys Ala Leu Tyr
145                 150                 155                 160
Ile Asp Ala Glu Gly Thr Phe Arg Pro Gln Arg Leu Leu Gln Ile Ala
                165                 170                 175
Asp Arg Phe Gly Leu Asn Gly Ala Asp Val Leu Glu Asn Val Ala Tyr
            180                 185                 190
Ala Arg Ala Tyr Asn Thr Asp His Gln Ser Arg Leu Leu Leu Glu Ala
        195                 200                 205
Ala Ser Met Met Ile Glu Thr Arg Phe Ala Leu Met Val Val Asp Ser
    210                 215                 220
Ala Thr Ala Leu Tyr Arg Thr Asp Phe Ser Gly Arg Gly Glu Leu Ser
225                 230                 235                 240
Ala Arg Gln Met His Met Ala Lys Phe Leu Arg Ser Leu Gln Lys Leu
                245                 250                 255
Ala Asp Glu Phe Gly Val Ala Val Val Ile Thr Asn Gln Val Val Ala
            260                 265                 270
Gln Val Asp Gly Ser Ala Met Phe Ala Gly Pro Gln Phe Lys Pro Ile
        275                 280                 285
Gly Gly Asn Ile Met Ala His Ala Ser Thr Thr Arg Leu Ala Leu Arg
    290                 295                 300
Lys Gly Arg Gly Glu Glu Arg Ile Cys Lys Val Ile Ser Ser Pro Cys
305                 310                 315                 320
Leu Ala Glu Ala Glu Ala Arg Phe Gln Leu Ala Ser Glu Gly Ile Ala
                325                 330                 335
Asp Val Lys Asp
            340

<210> 4 
<211> 461 
<212> DNA 
<213> Zea mays 

<400> 4
ccatacctgc tttacaggca tcttcagatc cattggtctg ctatttgctt tgtcattcct  60
tgggccaact ttcgtgttgc ctcaccttga tgtacaaaac ggtttcgttc acatatgtga  120
atgcacgcct gtgactgatt taggcgtcct gttgtaaata aaacgatgcc tgttgccctg  180
ttgtgtgttg catgtaatcg acaactctac atatcacaat tatgatgtat tttaggtttt  240
attgttcgct tagcacagcc attgctggat gtgcaatgtg ggattataga caagaatcca  300
cacaacaaca atggccaatc ctgataaagt agttagtgac ttgggcaaat agcattgtgg  360
tgatctttga gttcacttgt gataagaaca gggctggtgg ctggtggtga aaactaactt  420
gtgatcggaa caggtttaat agggaaaact aaggattcta t  461

<210> 5 
<211> 1574 
<212> DNA 
<213> Zea mays 

<220> 
<221> CDS 

<222> (73) . . . (1092) 
<400> 5
gaattcggca cgagattttt tgccgcttcg gaggcacctt cgaacaaagc ccaaaagcag  60

ccagcgcacc gc atg tcc tcg tct tcg gcg cac cag aag gcg tcg ccg ccg  111
              Met Ser Ser Ser Ser Ala His Gln Lys Ala Ser Pro Pro
              1               5                   10

ata gag gag gaa gcg acg gag cac gga ccc ttc ccc atc gaa cag cta  159
Ile Glu Glu Glu Ala Thr Glu His Gly Pro Phe Pro Ile Glu Gln Leu
    15                  20                  25

cag gca tct gga ata gct gca ctt gat gtg aaa aaa ctc aaa gat gct  207
Gln Ala Ser Gly Ile Ala Ala Leu Asp Val Lys Lys Leu Lys Asp Ala
30                  35                  40                  45

ggt ctc tgc aca gtg gaa tct gta gca tac tct cca agg aaa gac ctt  255
Gly Leu Cys Thr Val Glu Ser Val Ala Tyr Ser Pro Arg Lys Asp Leu
                50                  55                  60

ttg caa att aaa ggg att agt gaa gcc aaa gtc gac aag ata att gaa  303
Leu Gln Ile Lys Gly Ile Ser Glu Ala Lys Val Asp Lys Ile Ile Glu
            65                  70                  75

gca gct tcc aag ttg gtt cca ctc gga ttt act agt gct agc caa ctt  351
Ala Ala Ser Lys Leu Val Pro Leu Gly Phe Thr Ser Ala Ser Gln Leu
        80                  85                  90

cat gca cag aga ctt gag atc atc cag ctt aca act gga tct aga gag  399
His Ala Gln Arg Leu Glu Ile Ile Gln Leu Thr Thr Gly Ser Arg Glu
    95                  100                 105

ctt gat caa att ttg gac ggt gga ata gaa aca gga tct atc aca gag  447
Leu Asp Gln Ile Leu Asp Gly Gly Ile Glu Thr Gly Ser Ile Thr Glu
110                 115                 120                 125

atg tat ggt gaa ttt cgc tcc ggg aag act cag ttg tgc cac act ctc  495
Met Tyr Gly Glu Phe Arg Ser Gly Lys Thr Gln Leu Cys His Thr Leu
                130                 135                 140

tgt gtc aca tgt cag ctc cca ttg gac caa ggt ggt ggt gaa gga aag  543
Cys Val Thr Cys Gln Leu Pro Leu Asp Gln Gly Gly Gly Glu Gly Lys
            145                 150                 155

gct ttg tat att gat gca gag ggt aca ttc agg cct caa aga att ctc  591
Ala Leu Tyr Ile Asp Ala Glu Gly Thr Phe Arg Pro Gln Arg Ile Leu
        160                 165                 170

cag ata gca gac agg ttt ggc ttg aat ggc gct gat gta cta gag aat  639
Gln Ile Ala Asp Arg Phe Gly Leu Asn Gly Ala Asp Val Leu Glu Asn
    175                 180                 185

gtg gct tat gcc aga gca tat aac act gat cat caa tca aga ctt ttg  687
Val Ala Tyr Ala Arg Ala Tyr Asn Thr Asp His Gln Ser Arg Leu Leu
190                 195                 200                 205

cta gaa gca gcc tcc atg atg gta gag acc agg ttt gct ctc atg gtt  735
Leu Glu Ala Ala Ser Met Met Val Glu Thr Arg Phe Ala Leu Met Val
                210                 215                 220

gtg gat agt gct aca gcc ctt tac aga act gat ttc tct ggt aga ggg  783
Val Asp Ser Ala Thr Ala Leu Tyr Arg Thr Asp Phe Ser Gly Arg Gly
            225                 230                 235

gag cta tca gca agg cag atg cat ctg gcg aag ttt ctt agg agc ctt  831
Glu Leu Ser Ala Arg Gln Met His Leu Ala Lys Phe Leu Arg Ser Leu
        240                 245                 250

caa aag tta gca gat gag ttt gga gtg gca gtg gta atc acg aac caa  879
Gln Lys Leu Ala Asp Glu Phe Gly Val Ala Val Val Ile Thr Asn Gln
    255                 260                 265

gta gtg gct caa gtg gat ggt gct gca atg ttt gct ggg cca cag atc  927
Val Val Ala Gln Val Asp Gly Ala Ala Met Phe Ala Gly Pro Gln Ile
270                 275                 280                 285

aag ccc att gga ggg aac atc atg gct cat gct tcc aca act agg ctc  975
Lys Pro Ile Gly Gly Asn Ile Met Ala His Ala Ser Thr Thr Arg Leu
                290                 295                 300

ttt ctt cgc aag gga aga ggg gag gag cgg atc tgc aaa gta atc agc  1023
Phe Leu Arg Lys Gly Arg Gly Glu Glu Arg Ile Cys Lys Val Ile Ser
            305                 310                 315

tct ccc tgc ctg gct gaa gct gaa gca agg ttt cag ata tca tct gag  1071
Ser Pro Cys Leu Ala Glu Ala Glu Ala Arg Phe Gln Ile Ser Ser Glu
        320                 325                 330

ggt gtc act gat gtc aag gac tgaaagcatc ctcatttgca gtccacagca  1122
Gly Val Thr Asp Val Lys Asp
    335                 340

taacttgcca attcagacga atctctgatc tgctgcactc gtgtcggtcc cttgtacaat  1182
caaaatacca gtacaggctt ccagaatgcg aatgcaaatc cgttggagtg tggcactgtc  1242
atcctgttgt ctttaggtac catctaaagt tggcattgtt gtaaagtggt agagcgcaag  1302
gctctacttt gtagccgtgg attcgagccc tatggtgggc gttatttaat ttttttggcg  1362
aaaaagcctt taattgagtt gtttaggtga tatgaataac tctttaggtc atggagttcg  1422
actccatggg agtttaagct gggttaaaaa aaattatggt cacgatcttt ttcacatggg  1482
ctactgtaac atctcgtcta ctcctgaacc gatgttaagc tttttaggac tatagatcat  1542
cttcatatat caacaaaaaa aaaaaaaaaa aa  1574

<210> 6 
<211> 1020 
<212> DNA 
<213> Zea mays 

<400> 6 

atgtcctcgt cttcggcgca ccagaaggcg tcgccgccga tagaggagga agcgacggag  60
cacggaccct tccccatcga acagctacag gcatctggaa tagctgcact tgatgtgaaa  120
aaactcaaag atgctggtct ctgcacagtg gaatctgtag catactctcc aaggaaagac  180
cttttgcaaa ttaaagggat tagtgaagcc aaagtcgaca agataattga agcagcttcc  240
aagttggttc cactcggatt tactagtgct agccaacttc atgcacagag acttgagatc  300
atccagctta caactggatc tagagagctt gatcaaattt tggacggtgg aatagaaaca  360
ggatctatca cagagatgta tggtgaattt cgctccggga agactcagtt gtgccacact  420
ctctgtgtca catgtcagct cccattggac caaggtggtg gtgaaggaaa ggctttgtat  480
attgatgcag agggtacatt caggcctcaa agaattctcc agatagcaga caggtttggc  540
ttgaatggcg ctgatgtact agagaatgtg gcttatgcca gagcatataa cactgatcat  600
caatcaagac ttttgctaga agcagcctcc atgatggtag agaccaggtt tgctctcatg  660
gttgtggata gtgctacagc cctttacaga actgatttct ctggtagagg ggagctatca  720
gcaaggcaga tgcatctggc gaagtttctt aggagccttc aaaagttagc agatgagttt  780
ggagtggcag tggtaatcac gaaccaagta gtggctcaag tggatggtgc tgcaatgttt  840
gctgggccac agatcaagcc cattggaggg aacatcatgg ctcatgcttc cacaactagg  900
ctctttcttc gcaagggaag aggggaggag cggatctgca aagtaatcag ctctccctgc  960
ctggctgaag ctgaagcaag gtttcagata tcatctgagg gtgtcactga tgtcaaggac  1020

<210> 7 

<211> 340 

<212> PRT 

<213> Zea mays

<400> 7
Met Ser Ser Ser Ser Ala His Gln Lys Ala Ser Pro Pro Ile Glu Glu
1               5                   10                  15
Glu Ala Thr Glu His Gly Pro Phe Pro Ile Glu Gln Leu Gln Ala Ser
            20                  25                  30
Gly Ile Ala Ala Leu Asp Val Lys Lys Leu Lys Asp Ala Gly Leu Cys
        35                  40                  45
Thr Val Glu Ser Val Ala Tyr Ser Pro Arg Lys Asp Leu Leu Gln Ile
    50                  55                  60
Lys Gly Ile Ser Glu Ala Lys Val Asp Lys Ile Ile Glu Ala Ala Ser
65                  70                  75                  80
Lys Leu Val Pro Leu Gly Phe Thr Ser Ala Ser Gln Leu His Ala Gln
                85                  90                  95
Arg Leu Glu Ile Ile Gln Leu Thr Thr Gly Ser Arg Glu Leu Asp Gln
            100                 105                 110
Ile Leu Asp Gly Gly Ile Glu Thr Gly Ser Ile Thr Glu Met Tyr Gly
        115                 120                 125
Glu Phe Arg Ser Gly Lys Thr Gln Leu Cys His Thr Leu Cys Val Thr
    130                 135                 140
Cys Gln Leu Pro Leu Asp Gln Gly Gly Gly Glu Gly Lys Ala Leu Tyr
145                 150                 155                 160
Ile Asp Ala Glu Gly Thr Phe Arg Pro Gln Arg Ile Leu Gln Ile Ala
                165                 170                 175
Asp Arg Phe Gly Leu Asn Gly Ala Asp Val Leu Glu Asn Val Ala Tyr
            180                 185                 190
Ala Arg Ala Tyr Asn Thr Asp His Gln Ser Arg Leu Leu Leu Glu Ala
        195                 200                 205
Ala Ser Met Met Val Glu Thr Arg Phe Ala Leu Met Val Val Asp Ser
    210                 215                 220
Ala Thr Ala Leu Tyr Arg Thr Asp Phe Ser Gly Arg Gly Glu Leu Ser
225                 230                 235                 240
Ala Arg Gln Met His Leu Ala Lys Phe Leu Arg Ser Leu Gln Lys Leu
                245                 250                 255
Ala Asp Glu Phe Gly Val Ala Val Val Ile Thr Asn Gln Val Val Ala
            260                 265                 270
Gln Val Asp Gly Ala Ala Met Phe Ala Gly Pro Gln Ile Lys Pro Ile
        275                 280                 285
Gly Gly Asn Ile Met Ala His Ala Ser Thr Thr Arg Leu Phe Leu Arg
    290                 295                 300
Lys Gly Arg Gly Glu Glu Arg Ile Cys Lys Val Ile Ser Ser Pro Cys
305                 310                 315                 320
Leu Ala Glu Ala Glu Ala Arg Phe Gln Ile Ser Ser Glu Gly Val Thr
                325                 330                 335
Asp Val Lys Asp
            340



<210> 8
<211> 458
<212> DNA
<213> Zea mays

<400> 8
catcctcatt tgcagtccac agcataactt gccaattcag acgaatctct gatctgctgc  60
actcgtgtcg gtcccttgta caatcaaaat accagtacag gcttccagaa tgcgaatgca  120
aatccgttgg agtgtggcac tgtcatcctg ttgtctttag gtaccatcta aagttggcat  180
tgttgtaaag tggtagagcg caaggctcta ctttgtagcc gtggattcga gccctatggt  240
gggcgttatt taattttttt ggcgaaaaag cctttaattg agttgtttag gtgatatgaa  300
taactcttta ggtcatggag ttcgactcca tgggagttta agctgggtta aaaaaaatta  360
tggtcacgat ctttttcaca tgggctactg taacatctcg tctactcctg aaccgatgtt  420
aagcttttta ggactataga tcatcttcat atatcaac  458



<210> 9 

<211> 582 

<212> DNA 

<213> Zea mays 



<400> 9 

ccatacctgc tttacaggca tcttcagatc cattggtctg ctatttgctt tgtcattcct 60 

tgggccaact ttcgtgttgc ctcaccttga tgtacaaaac ggtttcgttc acatatgtga 120 

atgcacgcct gtgactgatt taggcgtcct gttgtaaata aaacgatgcc tgttgccctg 180 

ttgtgtgttg catgtaatcg acaactctac atatcacaat tatgatgtat tttaggtttt 240 

attgttcgct tagcacagcc attgctggat gtgcaatgtg ggattataga caagaatcca 300 

cacaacaaca atggccaatc ctgataaagt agttagtgac ttgggcaaat agcattgtgg 360 

tgatctttga gttcacttgt gataagaaca gggctggtgg ctggtggtga aaactaactt 420 

gtgatcggaa caggtttaat agggaaaact aaggattcta taaaaaaaaa aataaaaaaa 480 

aaaaaaaaaa actcgagggg gggcccggta cccaattcgc cctatagtga gtgagtcgta 540

ttacaattca ctggccgtcg ttttacaacg tcgtgactgg ga 582 



<210> 10 
<211> 567 
<212> DNA 
<213> Zea mays 

<400> 10 

catcctcatt tgcagtccac agcataactt gccaattcag acgaatctct gatctgctgc 60 

actcgtgtcg gtcccttgta caatcaaaat accagtacag gcttccagaa tgcgaatgca 120 

aatccgttgg agtgtggcac tgtcatcctg ttgtctttag gtaccatcta aagttggcat 180 

tgttgtaaag tggtagagcg caaggctcta ctttgtagcc gtggattcga gccctatggt 240 

gggcgttatt taattttttt ggcgaaaaag cctttaattg agttgtttag gtgatatgaa 300 

taactcttta ggtcatggag ttcgactcca tgggagttta agctgggtta aaaaaaatta 360 

tggtcacgat ctttttcaca tgggctactg taacatctcg tctactcctg aaccgatgtt 420 

aagcttttta ggactataga tcatcttcat atatcaacaa aaaaaaaaaa aaaaaactcg 480 

agggggggcc cggtacccaa ttcgccctat agtgagtgag tcgtactaca attcactggc 540

cgtcgtttta caacgtcgtg actggga 567 



<210> 11 

<211> 360 

<212> DNA 

<213> Zea mays 
<400> 11

acattcagac cacaaaggct cttgcagatt gctgacaggt ttggactgaa tggtgctgat 60 

gtgttagaga atgtggctta tgccagagct tataatacgg atcatcaatc tagacttctg 120 

ctggaagcag ctcccatgat gatagagacc aggtttactc ttatggttgt agacagtgcc 180 

acagctctgt acagaactga tttctcagga agaggggaac tatcagcgag gcaaatgcac 240 

atggctaagt tcctgaggag ccttcagaag ttagctgatg agtttggagt agctgtggtt 300 

atcaccaatc aagtagtggc ccaagtggat ggatctacta tgtttgccgg gccgcagttc 360 

<210> 12 
<211> 38 
<212> DNA 
<213> Zea mays 

<400> 12 

tatagaattc cacaaaggct cttgcagatt gctgacag 38 

<210> 13 
<211> 37 
<212> DNA 
<213> Zea mays 

<400> 13 

atactcgagg cccagcaaac atagtagatc catccac 37

<210> 14 
<211> 24 
<212> DNA 
<213> Zea mays 

<400> 14 

tcccagtcac gacgttgtaa aacg 24 

<210> 15 
<211> 36 
<212> DNA 
<213> Zea mays 

<400> 15 

agcggataac aatttcacac aggaaacagc tatgac 36

<210> 16 

<211> 51 

<212> DNA 

<213> Zea mays 

<400> 16 

gtattgcaga tgttaaggat tgagaccata cctggttaac aggcatctca g 51 

<210> 17 
<211> 27 
<212> DNA 
<213> Zea mays 

<400> 17 

gcagccaggg atccacatgt cctcgtc 27 



<210> 18 

<211> 52 
<212> DNA 
<213> Zea mays 

<400> 18 

ctgatgtcaa ggactgaaag catcctcatt tgcagtcaac agcataactt gc 52 

<210> 19 
<211> 22 
<212> DNA 
<213> Zea mays

<400> 19 

ccatacctgc tntacaggca tc 22 

<210> 20 
<211> 22 
<212> DNA 
<213> Zea mays 



<400> 20 
catcctcatt tgsagtccac ag 22